As I mentioned before, the easiest approach is to dump the array to a file and then load that file as a numpy array.

First, we need the size of the huge list:

huge_list_size = len(huge_list)

Next, we dump it to disk, writing one item per line to huge_array.txt, the file that is read back in the steps below.
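A minimal sketch of this dump step, assuming huge_list holds plain numbers. The helper name dump_list_to_file and the tiny stand-in list are made up for illustration; the only thing that matters for the later steps is that line i of the file holds item i of the list.

```python
def dump_list_to_file(values, filename):
    """Write one value per line so that line i corresponds to values[i]."""
    with open(filename, 'w') as dumpfile:
        for item in values:
            dumpfile.write(f"{item}\n")


# Small stand-in for the real huge_list (hypothetical data, illustration only).
huge_list = [0.5, 1.25, 2.0, 3.75]
huge_list_size = len(huge_list)

dump_list_to_file(huge_list, 'huge_array.txt')
```

Writing item by item keeps memory use flat, since the text form of the whole list never has to exist in memory at once.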

If all of this happens in the same session, make sure to free the memory:

del huge_list

Next, we define a simple read generator:

def read_file_generator(filename):
    with open(filename) as infile:
        for i, line in enumerate(infile):
            # Convert each line to a float before handing it to the array
            yield i, float(line)

Then we create a numpy array of zeros and fill it using the generator we just defined:

import numpy as np

huge_array = np.zeros(huge_list_size, dtype='float16')
for i, item in read_file_generator('huge_array.txt'):
    huge_array[i] = item
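To see the whole round trip in one place, here is a small self-contained sketch. It assumes one number per line in the dump file. The function load_dump_into_array and the four-element sample list are hypothetical names and data added for illustration, and the file goes into a temporary directory so nothing is left behind.

```python
import os
import tempfile

import numpy as np


def read_file_generator(filename):
    """Yield (index, value) pairs, one per line of the dump file."""
    with open(filename) as infile:
        for i, line in enumerate(infile):
            yield i, float(line)


def load_dump_into_array(filename, size, dtype='float16'):
    """Preallocate `size` zeros, then fill the array one line at a time."""
    array = np.zeros(size, dtype=dtype)
    for i, item in read_file_generator(filename):
        array[i] = item
    return array


# Tiny stand-in for the real data; these values are exact in float16.
sample = [0.5, 1.25, 2.0, 3.75]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'huge_array.txt')
    with open(path, 'w') as dumpfile:
        for item in sample:
            dumpfile.write(f"{item}\n")
    huge_array = load_dump_into_array(path, len(sample))

print(huge_array.dtype, huge_array.nbytes)
```

Keep in mind that float16 uses 2 bytes per element but only carries about three significant decimal digits and overflows above 65504, so a wider dtype such as float32 may be the safer choice if the values need more range or precision.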

My previous answer was incorrect. I suggested the following as a solution, which it is not, as hpaulj pointed out in a comment:

You can do this in multiple ways; the easiest would be to just dump the array to a file and then load that file as a numpy array:

with open('huge_array.txt', 'w') as dumpfile:
    for item in huge_array:
        print(item, file=dumpfile)

Then load it as a numpy array:

huge_array = numpy.loadtxt('huge_array.txt')

If you want to perform further computations on this data, you can also use the joblib library for memmapping, which is extremely useful for handling large numpy array computations. Available at