[Python][pandas] Sorting Data


5hr1rnp 2025. 2. 13. 21:27

This guide covers various methods for sorting data in Pandas, including the primary sorting functions sort_values() and sort_index(), as well as nlargest(), nsmallest(), reindex(), and the use of the key parameter in sort_values().

1. Sorting with sort_values()

The sort_values() method sorts a DataFrame based on column values. It is the most commonly used sorting function.

import pandas as pd

df = pd.DataFrame({
    'A': [3, 1, 2, 4],
    'B': [10, 50, 20, 20],
    'C': ['b', 'a', 'b', 'c']
})

# 1) Sorting by a single column
df_sorted_single = df.sort_values(by='A')
print(df_sorted_single)

#    A   B  C
# 1  1  50  a
# 2  2  20  b
# 0  3  10  b
# 3  4  20  c

# 2) Sorting by multiple columns
df_sorted_multi = df.sort_values(by=['B', 'A'], ascending=[True, False])
print(df_sorted_multi)

#    A   B  C
# 0  3  10  b
# 3  4  20  c
# 2  2  20  b
# 1  1  50  a

Key Parameters

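by: Column name or list of column names to sort by.
ascending: True (default) for ascending order, False for descending. Pass a list of booleans to set the direction per column; it must match the length of by.
inplace: If True, sorts the original DataFrame in place and returns None. The default (False) returns a new sorted DataFrame.
na_position: 'last' (default) or 'first'; controls where NaN values are placed.
key: (pandas 1.1.0+) A vectorized function applied to each sort column before sorting. It receives a Series and must return a Series of the same shape.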
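As a minimal sketch of na_position and inplace, the snippet below uses a small hypothetical DataFrame (df_nan is not part of the examples above) with one missing value. Note that NaN stays last in a descending sort unless na_position says otherwise.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value in column 'A'
df_nan = pd.DataFrame({'A': [3.0, np.nan, 1.0], 'B': ['x', 'y', 'z']})

# NaN rows go last by default; na_position='first' moves them to the top
nan_first = df_nan.sort_values(by='A', na_position='first')
print(nan_first.index.tolist())   # [1, 2, 0]

# inplace=True sorts df_nan itself and returns None
result = df_nan.sort_values(by='A', ascending=False, inplace=True)
print(result)                     # None
print(df_nan.index.tolist())      # [0, 2, 1]  (NaN row still last)
```

Because inplace=True returns None, assigning its result back to a variable (df = df.sort_values(..., inplace=True)) is a common mistake.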

Example: Sorting Without Case Sensitivity

df.sort_values(by='C', key=lambda col: col.str.lower())
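Column 'C' in the example DataFrame is all lowercase, so the effect of key is not visible there. The sketch below uses hypothetical mixed-case data (df_case is an assumption for illustration) to show how the default ordering differs from the case-insensitive one.

```python
import pandas as pd

# Hypothetical mixed-case data
df_case = pd.DataFrame({'C': ['banana', 'Apple', 'Cherry']})

# Default sort: uppercase letters sort before lowercase ones
default_order = df_case.sort_values(by='C')['C'].tolist()
print(default_order)   # ['Apple', 'Cherry', 'banana']

# key lowercases the values for comparison only; the stored data is unchanged
ci_order = df_case.sort_values(by='C', key=lambda col: col.str.lower())['C'].tolist()
print(ci_order)        # ['Apple', 'banana', 'Cherry']
```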

2. Sorting with sort_index()

The sort_index() method sorts a DataFrame by its labels rather than its values: row index labels by default, or column names with axis=1.

df = pd.DataFrame({
    'A': [3, 1, 2],
    'B': [2, 3, 1]
}, index=['c', 'a', 'b'])

# Sorting by index (ascending)
df_sorted_index = df.sort_index()
print(df_sorted_index)

#    A  B
# a  1  3
# b  2  1
# c  3  2

# Sorting by index (descending)
df_sorted_index_desc = df.sort_index(ascending=False)
print(df_sorted_index_desc)

#    A  B
# c  3  2
# b  2  1
# a  1  3

# Sorting by column names
df_sorted_columns = df.sort_index(axis=1)
print(df_sorted_columns)

#    A  B
# c  3  2
# a  1  3
# b  2  1

Key Parameters

axis: 0 (default) for row index, 1 for column names.
ascending: True (default) for ascending order, False for descending.
inplace: If True, sorts the original DataFrame in place and returns None.
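In the example above the columns are already in alphabetical order, so axis=1 appears to do nothing. A small sketch with a hypothetical frame (df_cols, with columns deliberately out of order) makes the effect visible.

```python
import pandas as pd

# Hypothetical frame whose columns and rows are both out of order
df_cols = pd.DataFrame({'B': [2, 3, 1], 'A': [3, 1, 2]}, index=['c', 'a', 'b'])

# axis=1 reorders the columns only; the row order is untouched
by_columns = df_cols.sort_index(axis=1)
print(by_columns.columns.tolist())   # ['A', 'B']
print(by_columns.index.tolist())     # ['c', 'a', 'b']

# Chain both calls to sort rows and columns by label
both = df_cols.sort_index(axis=0).sort_index(axis=1)
print(both)
#    A  B
# a  1  3
# b  2  1
# c  3  2
```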


3. Selecting Top or Bottom n Values: nlargest() and nsmallest()

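The nlargest() and nsmallest() methods return the n rows with the largest or smallest values in a specified column. The result matches sort_values() followed by head(n), but these methods are generally faster when n is small relative to the number of rows.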

# Selecting the 3 rows with the largest values in column 'B'
df_top3 = df.nlargest(3, 'B')

#    A  B
# a  1  3
# c  3  2
# b  2  1

# Selecting the 3 rows with the smallest values in column 'B'
df_bottom3 = df.nsmallest(3, 'B')

#    A  B
# b  2  1
# c  3  2
# a  1  3
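The following sketch compares nlargest() with the equivalent sort_values().head(n) and shows the keep parameter, which decides what happens when tied values straddle the cutoff. The DataFrame df_scores is hypothetical and chosen so that two rows tie on 20.

```python
import pandas as pd

# Hypothetical scores; rows 2 and 3 tie on 20
df_scores = pd.DataFrame({'score': [10, 50, 20, 20, 40]})

# Same rows as sorting descending and taking the head
top2 = df_scores.nlargest(2, 'score')
via_sort = df_scores.sort_values('score', ascending=False).head(2)
print(top2.index.tolist())       # [1, 4]
print(via_sort.index.tolist())   # [1, 4]

# keep controls ties at the cutoff: 'first' (default), 'last', or 'all'
print(df_scores.nlargest(3, 'score', keep='first').index.tolist())  # [1, 4, 2]
print(df_scores.nlargest(3, 'score', keep='last').index.tolist())   # [1, 4, 3]
print(df_scores.nlargest(3, 'score', keep='all').index.tolist())    # [1, 4, 2, 3]
```

With keep='all' the result can contain more than n rows, so it is worth checking the length if downstream code assumes exactly n.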

4. Reindexing with reindex()

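The reindex() method rearranges rows or columns to match a provided list of labels. While not strictly a sorting method, it can be used to put a DataFrame into any explicit order. Labels that do not exist in the original index produce rows filled with NaN.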

# Sorting index manually
sorted_idx = sorted(df.index)
df_reindexed = df.reindex(sorted_idx)
print(df_reindexed)

#    A  B
# a  1  3
# b  2  1
# c  3  2
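Where reindex() differs from sorting is in arbitrary orders and missing labels. The sketch below rebuilds the same small frame under a hypothetical name (df_small) and shows a custom row order, a label that does not exist, and a column reorder.

```python
import pandas as pd

df_small = pd.DataFrame({'A': [3, 1, 2], 'B': [2, 3, 1]}, index=['c', 'a', 'b'])

# An explicit order that is neither ascending nor descending
custom = df_small.reindex(['b', 'c', 'a'])
print(custom.index.tolist())   # ['b', 'c', 'a']

# 'z' is not in the original index, so its row is filled with NaN
# (the integer columns are upcast to float to hold NaN)
with_missing = df_small.reindex(['a', 'z'])
print(with_missing)
#      A    B
# a  1.0  3.0
# z  NaN  NaN

# Columns can be reordered the same way
swapped = df_small.reindex(columns=['B', 'A'])
print(swapped.columns.tolist())   # ['B', 'A']
```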

5. Sorting MultiIndex DataFrames

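For DataFrames with a MultiIndex, sort_index() accepts a level parameter that selects which index levels to sort by. ascending can then be a list giving the direction for each level.

# Assumes df has a two-level MultiIndex; the single-level df from section 2 does not qualify
df_multi = df.sort_index(level=[0, 1], ascending=[True, False])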
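The call above only makes sense on a DataFrame that actually has a MultiIndex. As a self-contained sketch, the snippet below builds a hypothetical two-level index (the names group and year, and the frame df_mi, are assumptions for illustration) and sorts the first level ascending and the second descending.

```python
import pandas as pd

# Hypothetical two-level index: (group, year)
idx = pd.MultiIndex.from_tuples(
    [('b', 2021), ('a', 2020), ('b', 2020), ('a', 2021)],
    names=['group', 'year'],
)
df_mi = pd.DataFrame({'value': [1, 2, 3, 4]}, index=idx)

# group ascending, year descending within each group
df_mi_sorted = df_mi.sort_index(level=[0, 1], ascending=[True, False])
print(df_mi_sorted.index.tolist())
# [('a', 2021), ('a', 2020), ('b', 2021), ('b', 2020)]
print(df_mi_sorted['value'].tolist())   # [4, 2, 1, 3]

# Levels can also be referenced by name instead of position
by_name = df_mi.sort_index(level=['group', 'year'], ascending=[True, False])
print(by_name.equals(df_mi_sorted))     # True
```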

Summary of Sorting Methods

Method	Description
sort_values()	Sorts by column values (ascending/descending, multiple columns, custom sorting with key).
sort_index()	Sorts by row or column labels.
nlargest() / nsmallest()	Quickly extracts the top/bottom n rows from a specific column.
reindex()	Reorders rows/columns based on a given index list.
MultiIndex Sorting	Uses sort_index(level=...) for fine-grained control over MultiIndex sorting.

Most sorting tasks can be handled with sort_values() and sort_index(), while nlargest() and nsmallest() are useful for quickly extracting top or bottom values. The key parameter and MultiIndex sorting provide additional flexibility when needed.

License: Attribution, NonCommercial, NoDerivatives
