Received a label value of -2147483648 which is outside the valid range of [0, 5). #15

parkourcx · 2018-07-28T13:53:43Z

In the bi-directional LSTM Chinese word segmentation example, I get this error: tensorflow.python.framework.errors_impl.InvalidArgumentError: Received a label value of -2147483648 which is outside the valid range of [0, 5). Label values: -2147483648 -2147483648 2 3 -2147483648 0 0 0 0 0 0 0 0 0 0 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 2 3 -2147483648 0 0 0 0 0 0 0 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 2 3 -2147483648 0 0 0 0 0 0 0 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 -2147483648 2 3 -2147483648 0 0 0 0 0 -2147483648 -2147483648 -2147483648 -2147483648 ... (and so on)]. I am using my own dataset, processed into the same format as the sample dataset (今/B 天/M是/M个/M好/E3天/E2气/E), and this is the error I get. Is it because the sentences in my dataset are too long? How can I fix it?

yongyehuang · 2018-08-04T06:34:18Z

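@parkourcx Hi, thanks for the question. This is most likely not a sentence-length problem. Rather, the per-character labels produced during data processing are wrong. As I recall, the annotation uses only four tags, s b m e: s for a single-character word, b for the start of a word, m for the middle of a word, and e for the end of a word. Padding positions all get the tag x. Judging from your error, some of your labels are -2147483648, which cannot be right. I also don't quite understand why (今/B 天/M是/M个/M好/E3天/E2气/E) is tagged this way.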
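
A minimal sketch of how one might check a corpus line for tags the label mapping does not know about. The value -2147483648 is the smallest 32-bit signed integer; one plausible cause (an assumption, not confirmed in this thread) is that tags missing from the mapping became missing values and were later cast to int32. `TAG2ID`, `TOKEN_RE`, and `find_unknown_tags` are hypothetical names, and the ids shown are illustrative rather than the repository's actual ones.

```python
import re

# Hypothetical mapping: the four data tags from the thread plus the padding tag x.
TAG2ID = {"s": 0, "b": 1, "m": 2, "e": 3, "x": 4}

# One non-space character, a slash, then an ASCII tag such as B or E3.
# The tag class is ASCII-only so it cannot swallow the next Chinese character.
TOKEN_RE = re.compile(r"(\S)/([A-Za-z0-9]+)")


def find_unknown_tags(line, tag2id):
    """Return the (char, tag) pairs in `line` whose tag has no id in `tag2id`.

    Tags are compared case-insensitively, since the thread uses both
    lowercase (s b m e) and uppercase (S B M E) spellings.
    """
    return [
        (char, tag)
        for char, tag in TOKEN_RE.findall(line)
        if tag.lower() not in tag2id
    ]


sample = "今/B 天/M是/M个/M好/E3天/E2气/E"
unknown = find_unknown_tags(sample, TAG2ID)
print(unknown)
```

Running a check like this over the whole corpus before training surfaces unmapped tags (here E3 and E2) as a readable list instead of an out-of-range label inside the loss op.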

parkourcx · 2018-08-04T06:44:38Z

Thank you very much for the reply! Here is the situation: I am building a program that segments classical Chinese text into sentences. I tag this way to mark the start, middle, and end of each classical Chinese sentence: /S for a single-character sentence, /B for the start of a sentence, /M for the middle, and /E for the end. As I see it, sentence segmentation and word segmentation are both sequence segmentation problems, so your program should be able to handle classical Chinese sentence segmentation after some adjustment. Is my understanding correct? I later reworked my corpus and found that my labels were indeed the problem. The program now runs without errors and the model is training. I have one more question: if I use a 6-tag set (S B M E3 E2 E, meaning single-character sentence, start, middle, third-to-last character of the sentence, second-to-last character, and end), does the model need any changes apart from adapting the corpus preprocessing? Looking forward to your reply. Best wishes!

yongyehuang · 2018-08-04T06:50:14Z

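@parkourcx That should be fine. You can compare this tagging scheme against the plain four-tag s b m e scheme and see which one works better. As for the model, this one is fairly simple. You could also try an lstm+crf model (I haven't run one myself...), which is used quite often in sequence labeling.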
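
A minimal sketch of the 6-tag scheme described above (S B M E3 E2 E), applied to one already-segmented sentence. `six_tag` is a hypothetical helper. The thread does not say how sentences of two or three characters should be tagged, so the short-sentence cases below are assumptions.

```python
def six_tag(sentence):
    """Tag each character of one sentence with S / B / M / E3 / E2 / E.

    Assumed handling of short sentences (not specified in the thread):
      1 char  -> S
      2 chars -> B E
      3 chars -> B E2 E
      4+      -> B, then M repeated, then E3 E2 E
    """
    n = len(sentence)
    if n == 0:
        return []
    if n == 1:
        return ["S"]
    if n == 2:
        return ["B", "E"]
    if n == 3:
        return ["B", "E2", "E"]
    return ["B"] + ["M"] * (n - 4) + ["E3", "E2", "E"]


tags = six_tag("今天是个好天气")
print(" ".join(f"{c}/{t}" for c, t in zip("今天是个好天气", tags)))
```

With six data tags the number of output classes is no longer the one used for the four-tag setup, so the label mapping and the class count would presumably have to be updated together.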

parkourcx · 2018-08-04T06:56:29Z

OK, I'll give it a try. Also, I don't quite follow what you said just now about padding positions all getting the tag x. I changed tags=['s','b','m','e','x'] in the source code to tags=['S','B','M','E']. Will that have any effect?

yongyehuang · 2018-08-04T06:59:39Z

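@parkourcx Padding makes every sample the same length. Sequences that are too short are filled up with a special symbol, and that special symbol is always tagged with a new label of its own. So you should keep using tags=['s','b','m','e','x'].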
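
A minimal sketch of the padding step described here, assuming a fixed sample length and a dedicated padding tag x. `pad_tags`, `TAG2ID`, and the id values are hypothetical; the repository's own preprocessing may order the ids differently.

```python
# Hypothetical mapping following the tag order quoted in the thread.
TAG2ID = {"s": 0, "b": 1, "m": 2, "e": 3, "x": 4}


def pad_tags(tags, max_len, tag2id, pad_tag="x"):
    """Convert a tag sequence to ids of exactly `max_len` entries.

    Longer sequences are truncated; shorter ones are filled with the id
    of `pad_tag`, so every position still carries a valid class id.
    """
    ids = [tag2id[t] for t in tags[:max_len]]
    return ids + [tag2id[pad_tag]] * (max_len - len(ids))


padded = pad_tags(["b", "m", "e", "s"], 6, TAG2ID)
print(padded)
```

If x is removed from the tag list, the padded positions have no id to map to, which is one way invalid label values can end up in the training data.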

parkourcx · 2018-08-04T07:02:15Z

I don't have the x tag now. Does that mean the results are completely wrong? And class_num=5 is because there are 5 tags, right?

yongyehuang · 2018-08-04T07:11:34Z

@parkourcx 'x' is a tag that the code adds during processing. It is not a tag in the annotated data.

parkourcx · 2018-08-04T07:17:17Z

What happens if the tags list does not contain x? Would class_num then be 4 instead of 5? Right now my preprocessing used only the four tags SBME in tags. Do I need to add x and reprocess the corpus?
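
On the class_num question: the error message's valid range [0, 5) corresponds to five output classes, which matches four data tags plus the padding tag x. A minimal sketch of deriving the class count from the tag list, assuming the padding tag is kept as its own class as recommended above. `build_tag_vocab` is a hypothetical helper, and placing x last is an arbitrary choice here.

```python
def build_tag_vocab(data_tags, pad_tag="x"):
    """Return (tag2id, class_num) for the data tags plus one padding tag.

    class_num is simply the number of distinct ids, so it changes
    whenever the tag set changes.
    """
    tags = list(data_tags) + [pad_tag]
    if len(set(tags)) != len(tags):
        raise ValueError("duplicate tag in tag list")
    tag2id = {tag: i for i, tag in enumerate(tags)}
    return tag2id, len(tags)


four_vocab, four_classes = build_tag_vocab(["S", "B", "M", "E"])
six_vocab, six_classes = build_tag_vocab(["S", "B", "M", "E3", "E2", "E"])
print(four_classes, six_classes)
```

Under these assumptions the four-tag setup gives 5 classes and the six-tag setup gives 7, and the corpus would need to be reprocessed with the same mapping the model is trained against.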

parkourcx · 2018-08-10T12:15:04Z

Sorry to bother you again. I have a question about the A = { 'SB': 0, 'SS':0, 'ES':0, 'BE': 0, 'BM': 0, 'ME': 0, 'MM': 0, 'EB':0 } that is set up before the state transition matrix is computed. What do SS, SB, ES and so on mean here? I don't quite understand.
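
The keys of A read as tag bigrams: 'SB' would be a transition from tag S to tag B, and the eight keys are exactly the transitions that can occur in a valid S B M E sequence. A minimal sketch of estimating transition probabilities from such counts, assuming single-letter tags and normalization by the total count of transitions leaving the same tag. `transition_probs` is a hypothetical function, not the repository's code.

```python
from collections import Counter


def transition_probs(tag_sequences):
    """Estimate P(next tag | previous tag) from tag sequences.

    Each sequence is a string of single-letter tags such as "BMEBES".
    Keys of the result are two-letter strings like "SB" (S followed by B).
    """
    counts = Counter()
    for seq in tag_sequences:
        for prev_tag, next_tag in zip(seq, seq[1:]):
            counts[prev_tag + next_tag] += 1

    # Total number of transitions leaving each tag.
    totals = Counter()
    for key, count in counts.items():
        totals[key[0]] += count

    return {key: count / totals[key[0]] for key, count in counts.items()}


probs = transition_probs(["BMEBES", "SSBE"])
for key in sorted(probs):
    print(key, round(probs[key], 3))
```

For multi-character tags such as E3 and E2, the string keys would be ambiguous, so a tuple key like (prev_tag, next_tag) would be a safer choice.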

parkourcx · 2018-08-10T12:32:36Z

Is this how the transition probability matrix is computed?

parkourcx changed the title ~~ValueError: setting an array element with a sequence.~~ Received a label value of -2147483648 which is outside the valid range of [0, 5). Jul 29, 2018
