Program Tip

Drop columns whose name contains a specific string from a pandas DataFrame

programtip 2020. 11. 6. 19:04


I have a pandas DataFrame with the following column names:

Result1, Test1, Result2, Test2, Result3, Test3, etc.

I want to drop every column whose name contains the word "Test". The number of such columns is not fixed; it depends on a previous function.

How can I do that?

import numpy as np
import pandas as pd

array = np.random.random((2, 4))
df = pd.DataFrame(array, columns=('Test1', 'toto', 'test2', 'riri'))
print(df)

      Test1      toto     test2      riri
0  0.923249  0.572528  0.845464  0.144891
1  0.020438  0.332540  0.144455  0.741412

# Keep only the columns whose lower-cased name does not start with 'test'.
cols = [c for c in df.columns if c.lower()[:4] != 'test']
df = df[cols]
print(df)

       toto      riri
0  0.572528  0.144891
1  0.332540  0.741412
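
The snippet above only removes columns whose names start with "test". Since the question asks about names that contain the word anywhere, a minimal variation might look like the sketch below. It assumes a case-insensitive substring match is what is wanted, and the column name `myTest3` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'myTest3' has the word in the middle of its name.
df = pd.DataFrame(np.random.random((2, 4)),
                  columns=['Test1', 'toto', 'myTest3', 'riri'])

# Keep only the columns whose lower-cased name does not contain 'test'.
keep = [c for c in df.columns if 'test' not in c.lower()]
df = df[keep]

print(list(df.columns))  # ['toto', 'riri']
```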

Here is one good way to do it:

df = df[df.columns.drop(list(df.filter(regex='Test')))]
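
Note that `regex='Test'` is case-sensitive and matches anywhere in the name. As a rough illustration, using the column names from the earlier example rather than the asker's real data, the sketch below compares it with a case-insensitive pattern built with the inline `(?i)` flag:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4]], columns=['Test1', 'toto', 'test2', 'riri'])

# regex='Test' is case-sensitive, so only 'Test1' is matched and dropped.
case_sensitive = df[df.columns.drop(list(df.filter(regex='Test')))]

# The inline flag (?i) makes the pattern case-insensitive, so 'test2' goes too.
case_insensitive = df[df.columns.drop(list(df.filter(regex='(?i)test')))]

print(list(case_sensitive.columns))    # ['toto', 'test2', 'riri']
print(list(case_insensitive.columns))  # ['toto', 'riri']
```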

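Cheap, fast, and idiomatic: `str.contains`

In recent versions of pandas, you can use string methods on the index and on the columns. Here, str.startswith seems like a good fit.

To remove all columns whose names start with a given substring: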

df.columns.str.startswith('Test')
# array([ True, False, False, False])

df.loc[:,~df.columns.str.startswith('Test')]

  toto test2 riri
0    x     x    x
1    x     x    x

For case-insensitive matching, you can use regex-based matching with str.contains together with a start-of-line (SOL) anchor.

df.columns.str.contains('^test', case=False)
# array([ True, False,  True, False])

df.loc[:,~df.columns.str.contains('^test', case=False)] 

  toto riri
0    x    x
1    x    x

If the column labels may be of mixed types, also specify na=False.
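
To make the `na=False` advice concrete, here is a small sketch with a made-up frame whose column labels mix strings and an integer. The labels are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical frame whose column labels mix strings and an integer.
df = pd.DataFrame([[1, 2, 3]], columns=['Test1', 1, 'toto'])

# Non-string labels have no string value to match; na=False counts them as
# "no match", so the mask stays purely boolean and can be negated with ~.
mask = df.columns.str.contains('^test', case=False, na=False)

result = df.loc[:, ~mask]
print(list(result.columns))  # [1, 'toto']
```

Without `na=False`, the non-string label would yield a missing value in the mask, and the mask could no longer be safely negated.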

You can use 'filter' to select the columns you want.

import pandas as pd
import numpy as np

data2 = [{'test2': 1, 'result1': 2}, {'test': 5, 'result34': 10, 'c': 20}]

df = pd.DataFrame(data2)

df

      c  result1  result34  test  test2
0   NaN      2.0       NaN   NaN    1.0
1  20.0      NaN      10.0   5.0    NaN

Now filter the columns whose names contain 'result':

df.filter(like='result', axis=1)

This gives:

   result1  result34
0      2.0       NaN
1      NaN      10.0
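
`filter` keeps the matching columns rather than dropping them. If the goal is to drop by substring, one possible approach (a sketch, not something the answer itself proposes) is a negative-lookahead regex that only matches names in which the substring never appears:

```python
import pandas as pd

data2 = [{'test2': 1, 'result1': 2}, {'test': 5, 'result34': 10, 'c': 20}]
df = pd.DataFrame(data2)

# '^(?!.*test)' matches only names in which 'test' appears nowhere.
kept = df.filter(regex='^(?!.*test)', axis=1)

print(sorted(kept.columns))  # ['c', 'result1', 'result34']
```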

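Use the DataFrame.select method (note: select was deprecated in pandas 0.21 and removed in 1.0, so this only runs on older versions; on current versions, pass a boolean mask to df.loc instead):

In [38]: df = DataFrame({'Test1': randn(10), 'Test2': randn(10), 'awesome': randn(10)})

In [39]: df.select(lambda x: not re.search(r'Test\d+', x), axis=1)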
Out[39]:
   awesome
0    1.215
1    1.247
2    0.142
3    0.169
4    0.137
5   -0.971
6    0.736
7    0.214
8    0.111
9   -0.214
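
On pandas versions where `DataFrame.select` is no longer available, roughly the same result can be obtained with `df.loc` and a callable. This is a sketch of one equivalent, not the original answer's code:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Test1': np.random.randn(10),
                   'Test2': np.random.randn(10),
                   'awesome': np.random.randn(10)})

# The callable receives the frame and returns a boolean mask over the columns.
result = df.loc[:, lambda d: ~d.columns.str.contains(r'Test\d+')]

print(list(result.columns))  # ['awesome']
```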

This can be done neatly in one line with:

df = df.drop(df.filter(regex='Test').columns, axis=1)

Reference URL: https://stackoverflow.com/questions/19071199/drop-columns-whose-name-contains-a-specific-string-from-pandas-dataframe
