Program Tip

How to copy data from a numpy array to another

programtip 2020. 10. 16. 08:01


What is the fastest way to copy data from array b to array a without modifying the address of array a? I need this because an external library (PyFFTW) uses a pointer to my array that cannot change.

For example:

a = numpy.empty(n, dtype=complex)
for i in range(a.size):  # xrange in Python 2
  a[i] = b[i]

Is it possible to do this without a loop?


I believe

a = numpy.empty_like(b)
a[:] = b

will make a deep copy quickly. As Funsi mentions, recent versions of numpy also have the copyto function.
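Since the question hinges on a keeping its address, a quick check can confirm that slice assignment writes into the existing buffer. This is only a minimal sketch: the size and contents are made up for illustration, and `ctypes.data` stands in for the pointer an external library would hold.

```python
import numpy as np

n = 8  # hypothetical size, for illustration only
b = np.arange(n, dtype=complex)

a = np.empty_like(b)           # allocate the destination once
addr_before = a.ctypes.data    # address of a's data buffer

a[:] = b                       # copy the values into the existing buffer
addr_after = a.ctypes.data

same_address = addr_before == addr_after
values_match = np.array_equal(a, b)
independent = not np.shares_memory(a, b)
print(same_address, values_match, independent)
```

If a is already allocated elsewhere (as in the PyFFTW case), only the `a[:] = b` line is needed.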


As of version 1.7, numpy has the numpy.copyto function, which does what you are looking for:

numpy.copyto(dst, src)

Copies values from one array to another, broadcasting as necessary.

See: http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.copyto.html
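The broadcasting behaviour mentioned above may be easier to see in a small example. The sketch below uses made-up arrays; it also uses the `where` keyword of `numpy.copyto`, which the answer does not mention, to limit which elements are written.

```python
import numpy as np

dst = np.zeros((2, 3))
row = np.array([1.0, 2.0, 3.0])

# src is broadcast against dst: the row is copied into every row of dst.
np.copyto(dst, row)

# A scalar source broadcasts as well; `where` restricts which elements are written.
mask = np.array([[True, False, True],
                 [False, True, False]])
np.copyto(dst, -1.0, where=mask)
print(dst)
```

In both calls dst is modified in place, so its buffer is reused rather than reallocated.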


a = numpy.array(b)

is even faster than the suggested solutions up to numpy v1.6 and also makes a copy of the array. However, I could not test it against copyto(a, b), since I don't have the most recent version of numpy.
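One point worth checking against the original question: `numpy.array(b)` allocates a new array and rebinds the name, so anything still holding the old buffer does not see the new values. The sketch below illustrates this under the assumption that `held` (a hypothetical name) plays the role of the reference an external library keeps.

```python
import numpy as np

b = np.arange(4, dtype=complex)
a = np.zeros(4, dtype=complex)
held = a            # stands in for the reference an external library keeps

a = np.array(b)     # allocates a new array and rebinds the name `a`
rebinding_updates_held = np.array_equal(held, b)   # old buffer is untouched

np.copyto(held, b)  # writes into the original buffer instead
inplace_updates_held = np.array_equal(held, b)
print(rebinding_updates_held, inplace_updates_held)
```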


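To answer your question, I played with some variants and profiled them.

Conclusion: to copy data from a numpy array to another, use one of the built-in numpy functions, numpy.array(src) or numpy.copyto(dst, src), wherever possible.

(But always choose the latter if dst's memory is already allocated, so that the memory is reused. See the profiling at the end of the post.)

Profiling setup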

import timeit
import numpy as np
import pandas as pd
from IPython.display import display

def profile_this(methods, setup='', niter=10 ** 4, p_globals=None, **kwargs):
    if p_globals is not None:
        print('globals: {0}, tested {1:.0e} times'.format(p_globals, niter))
    timings = np.array([timeit.timeit(method, setup=setup, number=niter,
                                      globals=p_globals, **kwargs) for 
                        method in methods])
    ranking = np.argsort(timings)
    timings = np.array(timings)[ranking]
    methods = np.array(methods)[ranking]
    speedups = np.amax(timings) / timings

    # Full option name; newer pandas rejects the ambiguous pattern 'html'.
    pd.set_option('display.notebook_repr_html', False)
    data = {'time (s)': timings,
            'speedup': ['{:.2f}x'.format(s) if 1 != s else '' for s in speedups],
            'methods': methods}
    data_frame = pd.DataFrame(data, columns=['time (s)', 'speedup', 'methods'])

    display(data_frame)
    print()

Profiling code

setup = '''import numpy as np; x = np.random.random(n)'''
methods = (
    '''y = np.zeros(n, dtype=x.dtype); y[:] = x''',
    '''y = np.zeros_like(x); y[:] = x''',
    '''y = np.empty(n, dtype=x.dtype); y[:] = x''',
    '''y = np.empty_like(x); y[:] = x''',
    '''y = np.copy(x)''',
    '''y = np.array(x)''',  # listed in the results below
    '''y = x.astype(x.dtype)''',
    '''y = 1*x''',
    '''y = np.empty_like(x); np.copyto(y, x)''',
    '''y = np.empty_like(x); np.copyto(y, x, casting='no')''',
    '''y = np.empty(n)\nfor i in range(x.size):\n\ty[i] = x[i]'''
)

for n, it in ((2, 6), (3, 6), (3.8, 6), (4, 6), (5, 5), (6, 4.5)):
    profile_this(methods[:-1:] if n > 2 else methods, setup, 
                 niter=int(10 ** it), p_globals={'n': int(10 ** n)})

Results on Windows 7 with an Intel i7 CPU, CPython v3.5.0 and numpy v1.10.1.

globals: {'n': 100}, tested 1e+06 times

     time (s) speedup                                            methods
0    0.386908  33.76x                                    y = np.array(x)
1    0.496475  26.31x                              y = x.astype(x.dtype)
2    0.567027  23.03x              y = np.empty_like(x); np.copyto(y, x)
3    0.666129  19.61x                     y = np.empty_like(x); y[:] = x
4    0.967086  13.51x                                            y = 1*x
5    1.067240  12.24x  y = np.empty_like(x); np.copyto(y, x, casting=...
6    1.235198  10.57x                                     y = np.copy(x)
7    1.624535   8.04x           y = np.zeros(n, dtype=x.dtype); y[:] = x
8    1.626120   8.03x           y = np.empty(n, dtype=x.dtype); y[:] = x
9    3.569372   3.66x                     y = np.zeros_like(x); y[:] = x
10  13.061154          y = np.empty(n)\nfor i in range(x.size):\n\ty[...


globals: {'n': 1000}, tested 1e+06 times

   time (s) speedup                                            methods
0  0.666237   6.10x                              y = x.astype(x.dtype)
1  0.740594   5.49x              y = np.empty_like(x); np.copyto(y, x)
2  0.755246   5.39x                                    y = np.array(x)
3  1.043631   3.90x                     y = np.empty_like(x); y[:] = x
4  1.398793   2.91x                                            y = 1*x
5  1.434299   2.84x  y = np.empty_like(x); np.copyto(y, x, casting=...
6  1.544769   2.63x                                     y = np.copy(x)
7  1.873119   2.17x           y = np.empty(n, dtype=x.dtype); y[:] = x
8  2.355593   1.73x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  4.067133                             y = np.zeros_like(x); y[:] = x


globals: {'n': 6309}, tested 1e+06 times

   time (s) speedup                                            methods
0  2.338428   3.05x                                    y = np.array(x)
1  2.466636   2.89x                              y = x.astype(x.dtype)
2  2.561535   2.78x              y = np.empty_like(x); np.copyto(y, x)
3  2.603601   2.74x                     y = np.empty_like(x); y[:] = x
4  3.005610   2.37x  y = np.empty_like(x); np.copyto(y, x, casting=...
5  3.215863   2.22x                                     y = np.copy(x)
6  3.249763   2.19x                                            y = 1*x
7  3.661599   1.95x           y = np.empty(n, dtype=x.dtype); y[:] = x
8  6.344077   1.12x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  7.133050                             y = np.zeros_like(x); y[:] = x


globals: {'n': 10000}, tested 1e+06 times

   time (s) speedup                                            methods
0  3.421806   2.82x                                    y = np.array(x)
1  3.569501   2.71x                              y = x.astype(x.dtype)
2  3.618747   2.67x              y = np.empty_like(x); np.copyto(y, x)
3  3.708604   2.61x                     y = np.empty_like(x); y[:] = x
4  4.150505   2.33x  y = np.empty_like(x); np.copyto(y, x, casting=...
5  4.402126   2.19x                                     y = np.copy(x)
6  4.917966   1.96x           y = np.empty(n, dtype=x.dtype); y[:] = x
7  4.941269   1.96x                                            y = 1*x
8  8.925884   1.08x           y = np.zeros(n, dtype=x.dtype); y[:] = x
9  9.661437                             y = np.zeros_like(x); y[:] = x


globals: {'n': 100000}, tested 1e+05 times

    time (s) speedup                                            methods
0   3.858588   2.63x                              y = x.astype(x.dtype)
1   3.873989   2.62x                                    y = np.array(x)
2   3.896584   2.60x              y = np.empty_like(x); np.copyto(y, x)
3   3.919729   2.58x  y = np.empty_like(x); np.copyto(y, x, casting=...
4   3.948563   2.57x                     y = np.empty_like(x); y[:] = x
5   4.000521   2.53x                                     y = np.copy(x)
6   4.087255   2.48x           y = np.empty(n, dtype=x.dtype); y[:] = x
7   4.803606   2.11x                                            y = 1*x
8   6.723291   1.51x                     y = np.zeros_like(x); y[:] = x
9  10.131983                   y = np.zeros(n, dtype=x.dtype); y[:] = x


globals: {'n': 1000000}, tested 3e+04 times

     time (s) speedup                                            methods
0   85.625484   1.24x                     y = np.empty_like(x); y[:] = x
1   85.693316   1.24x              y = np.empty_like(x); np.copyto(y, x)
2   85.790064   1.24x  y = np.empty_like(x); np.copyto(y, x, casting=...
3   86.342230   1.23x           y = np.empty(n, dtype=x.dtype); y[:] = x
4   86.954862   1.22x           y = np.zeros(n, dtype=x.dtype); y[:] = x
5   89.503368   1.18x                                    y = np.array(x)
6   91.986177   1.15x                                            y = 1*x
7   95.216021   1.11x                                     y = np.copy(x)
8  100.524358   1.05x                              y = x.astype(x.dtype)
9  106.045746                             y = np.zeros_like(x); y[:] = x


Also, see results for a variant of the profiling where the destination's memory is already pre-allocated during value copying, since y = np.empty_like(x) is part of the setup:

globals: {'n': 100}, tested 1e+06 times

   time (s) speedup                        methods
0  0.328492   2.33x                np.copyto(y, x)
1  0.384043   1.99x                y = np.array(x)
2  0.405529   1.89x                       y[:] = x
3  0.764625          np.copyto(y, x, casting='no')


globals: {'n': 1000}, tested 1e+06 times

   time (s) speedup                        methods
0  0.453094   1.95x                np.copyto(y, x)
1  0.537594   1.64x                       y[:] = x
2  0.770695   1.15x                y = np.array(x)
3  0.884261          np.copyto(y, x, casting='no')


globals: {'n': 6309}, tested 1e+06 times

   time (s) speedup                        methods
0  2.125426   1.20x                np.copyto(y, x)
1  2.182111   1.17x                       y[:] = x
2  2.364018   1.08x                y = np.array(x)
3  2.553323          np.copyto(y, x, casting='no')


globals: {'n': 10000}, tested 1e+06 times

   time (s) speedup                        methods
0  3.196402   1.13x                np.copyto(y, x)
1  3.523396   1.02x                       y[:] = x
2  3.531007   1.02x                y = np.array(x)
3  3.597598          np.copyto(y, x, casting='no')


globals: {'n': 100000}, tested 1e+05 times

   time (s) speedup                        methods
0  3.862123   1.01x                np.copyto(y, x)
1  3.863693   1.01x                y = np.array(x)
2  3.873194   1.01x                       y[:] = x
3  3.909018          np.copyto(y, x, casting='no')
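The code for this pre-allocated variant is not shown in the post. A minimal sketch of how it might look follows, assuming the destination is simply moved into the timeit setup string; the array size and repetition count are arbitrary choices made here so that it runs quickly, not the values used for the tables above.

```python
import timeit
import numpy as np

n = 1000          # assumed array size
niter = 10 ** 4   # assumed repetition count, kept small for a quick run

# The destination is allocated in the setup, so only the copy itself is timed.
setup = 'import numpy as np; x = np.random.random(n); y = np.empty_like(x)'
methods = (
    'np.copyto(y, x)',
    'y = np.array(x)',
    'y[:] = x',
    "np.copyto(y, x, casting='no')",
)

timings = {m: timeit.timeit(m, setup=setup, number=niter, globals={'n': n})
           for m in methods}
for method, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print('{:10.6f} s  {}'.format(seconds, method))
```

Absolute numbers will differ by machine and numpy version, so only the relative ordering is worth comparing.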


You can simply use:

b = 1*a

This is a quick way (although the profiling above does not rank it fastest), but it also has some problems. If you do not set the dtype of a explicitly and do not check the dtype of b, you can get into trouble. For example:

a = np.arange(10)        # dtype = int64
b = 1*a                  # dtype = int64

a = np.arange(10.)       # dtype = float64
b = 1*a                  # dtype = float64

a = np.arange(10)        # dtype = int64
b = 1. * a               # dtype = float64

I hope I have made the point clear. Sometimes a single small operation changes the data type.
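The dtype pitfall also matters when the result is written back into a pre-allocated array. As a rough sketch with made-up arrays: slice assignment casts silently, while `numpy.copyto` with its default `casting='same_kind'` refuses a float-to-int copy.

```python
import numpy as np

a = np.arange(10, dtype=np.int64)
b = 1. * a                       # the Python float promotes the result
promoted = b.dtype               # float64

dst = np.empty(10, dtype=np.int64)
dst[:] = b + 0.5                 # slice assignment casts silently (fraction dropped)

try:
    np.copyto(dst, b)            # default casting='same_kind' rejects float -> int
    copyto_refused = False
except TypeError:
    copyto_refused = True
print(promoted, dst, copyto_refused)
```

Checking `b.dtype` before copying, or relying on the `copyto` error, is one way to catch an unintended type change early.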


There are many different things you can do:

a = np.copy(b)
a = np.array(b)  # Does exactly the same as np.copy
a[:] = b  # a needs to be preallocated
a = b[np.arange(b.shape[0])]  # fancy indexing returns a copy
a = copy.deepcopy(b)  # requires "import copy"

Things that don't work:

a = b     # only binds a second name to the same array
a = b[:]  # creates a view, not a copy; this has caused bugs in my code
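The reason these two fail is that neither allocates a new buffer. A small sketch with a made-up array shows the difference from a real copy, using `numpy.shares_memory` to check:

```python
import numpy as np

b = np.arange(5)

alias = b            # same object under a second name
view = b[:]          # new array object, but it views b's buffer
copied = b.copy()    # independent buffer

b[0] = 99            # modify the source in place

alias_is_b = alias is b
view_shares = np.shares_memory(view, b)
copy_shares = np.shares_memory(copied, b)
print(alias[0], view[0], copied[0])
```

Only `copied` keeps the old value; the alias and the view both reflect the change to b.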

Why not use

a = 0 + b

I think it is similar to the multiplication suggested earlier, but might be simpler.

Reference URL: https://stackoverflow.com/questions/6431973/how-to-copy-data-from-a-numpy-array-to-another
