How do I sort a Pandas data frame using values from several columns?


programtip 2020. 12. 4. 20:22

I have the following data frame:

import pandas

df = pandas.DataFrame([{'c1':3,'c2':10},{'c1':2, 'c2':30},{'c1':1,'c2':20},{'c1':2,'c2':15},{'c1':2,'c2':100}])

Or, in human-readable form:

   c1   c2
0   3   10
1   2   30
2   1   20
3   2   15
4   2  100

The following sort command works as expected:

df.sort(['c1','c2'], ascending=False)

Output (both columns in descending order):

   c1   c2
0   3   10
4   2  100
1   2   30
3   2   15
2   1   20

But the following command:

df.sort(['c1','c2'], ascending=[False,True])

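gives a result (copied verbatim further below) that is not what I expect. I expect the values in the first column to be ordered from largest to smallest and, where rows have identical values in the first column, to be ordered by the ascending values of the second column.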

Does anybody know why it does not work as expected?

Added

This is a copy-paste:

>>> df.sort(['c1','c2'], ascending=[False,True])
   c1   c2
2   1   20
3   2   15
1   2   30
4   2  100
0   3   10

DataFrame.sort is deprecated (and has since been removed from pandas, as the AttributeError below shows). Use DataFrame.sort_values instead.

>>> df.sort_values(['c1','c2'], ascending=[False,True])
   c1   c2
0   3   10
3   2   15
1   2   30
4   2  100
2   1   20
>>> df.sort(['c1','c2'], ascending=[False,True])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ampawake/anaconda/envs/pseudo/lib/python2.7/site-packages/pandas/core/generic.py", line 3614, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'sort'

Using sort may produce a warning message; see the GitHub discussion. You will therefore probably want to use sort_values instead; see the documentation.

The code could then look like this:

df = df.sort_values(by=['c1','c2'], ascending=[False,True])
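For reference, here is a minimal self-contained sketch of the whole round trip using the question's data. It assumes a pandas version that provides DataFrame.sort_values (0.17 or later); the variable name result is only for illustration.

```python
import pandas as pd

# The data frame from the question.
df = pd.DataFrame([
    {'c1': 3, 'c2': 10},
    {'c1': 2, 'c2': 30},
    {'c1': 1, 'c2': 20},
    {'c1': 2, 'c2': 15},
    {'c1': 2, 'c2': 100},
])

# c1 descending; ties in c1 are broken by c2 ascending.
result = df.sort_values(by=['c1', 'c2'], ascending=[False, True])

print(result)
# The original row labels are kept, so the index now reads 0, 3, 1, 4, 2.
print(result.index.tolist())
```

If a fresh 0..n-1 index is preferred after sorting, result.reset_index(drop=True) is one way to get it.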

As far as I understand, the DataFrame.sort() method is deprecated in pandas > 0.18. To solve your problem, use DataFrame.sort_values() instead:

df.sort_values(by=["c1","c2"], ascending=[False, True])

The output looks like this:

   c1   c2
0   3   10
3   2   15
1   2   30
4   2  100
2   1   20

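In my case, the accepted answer didn't work when written as a bare statement:

df.sort_values(by=["c1","c2"], ascending=[False, True])

sort_values returns a new, sorted DataFrame and leaves the original untouched, so only the following, which assigns the result back, worked as expected:

df = df.sort_values(by=["c1","c2"], ascending=[False, True])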
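The difference comes down to sort_values returning a new DataFrame and leaving the original unchanged unless inplace=True is passed. A small sketch on the question's data (the names unchanged_order and sorted_df are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'c1': [3, 2, 1, 2, 2], 'c2': [10, 30, 20, 15, 100]})

# Calling sort_values and discarding the return value leaves df as it was.
df.sort_values(by=['c1', 'c2'], ascending=[False, True])
unchanged_order = df.index.tolist()  # still the original row order

# Option 1: keep the returned, sorted frame.
sorted_df = df.sort_values(by=['c1', 'c2'], ascending=[False, True])

# Option 2: sort the existing object in place (the call itself returns None).
df.sort_values(by=['c1', 'c2'], ascending=[False, True], inplace=True)
```

In an interactive session the bare call appears to work because the returned frame is echoed to the screen, which is probably why the two forms are easy to confuse.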

If you are writing this code in a script file, you have to assign the result back (shown here with sort_values, since sort is no longer available in current pandas):

df = df.sort_values(['c1','c2'], ascending=[False,True])

I have found this to be really useful:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list(range(0, 10)) * 2, 'B': np.random.randint(20, 30, 20)})

# A ascending, B descending
df.sort_values(**skw(columns=['A', '-B']))

# A descending, B ascending
df.sort_values(**skw(columns=['-A', '+B']))

Note that unlike the standard by=, ascending= arguments, here column names and their sort order are in the same place. As a result your code gets a lot easier to read and maintain.

Note the actual call to .sort_values is unchanged; skw (sortkwargs) is just a small helper function that parses the columns and returns the usual by= and ascending= parameters for you (the original version targeted the old .sort method and returned columns= instead). Pass it any other sort kwargs as you usually would. Copy/paste the following code into e.g. your local utils.py, then forget about it and just use it as above.

# utils.py (or anywhere else convenient to import)
def skw(columns=None, **kwargs):
    """Build sort_values kwargs from column names prefixed with '+' or '-'."""
    # default order is ascending ('+') when no prefix is given
    sort_cols = [col if col[0] in '+-' else '+' + col for col in columns]
    # strip only the leading sign so names containing '+' or '-' survive
    by = [col[1:] for col in sort_cols]
    ascending = [col[0] == '+' for col in sort_cols]
    kwargs.update(dict(by=by, ascending=ascending))
    return kwargs

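Note: everything above is correct; just replace sort with sort_values, and pass the columns via by=, which DataFrame.sort_values requires. For example, assuming data.csv contains the columns c1 and c2:

import pandas as pd
df = pd.read_csv('data.csv')
df.sort_values(by=['c1', 'c2'], ascending=False, inplace=True)

See the official pandas documentation for DataFrame.sort_values.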

Reference URL: https://stackoverflow.com/questions/17618981/how-to-sort-pandas-data-frame-using-values-from-several-columns
