Pandas: Basics and Loading Files

A data analysis library for Python
Provides a wide range of functions for efficiently processing two-dimensional data made up of rows and columns
Installation: pip install pandas
Usage: import pandas as pd
Website: pandas.pydata.org
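As a quick illustration of what "two-dimensional data made up of rows and columns" looks like in practice, here is a minimal sketch. The three rows are made-up sample values, not real Titanic records; only the column names are borrowed from the dataset used later in this post.

```python
import pandas as pd

# Hypothetical sample rows; column names mirror the Titanic data used later
df = pd.DataFrame(
    {
        "PassengerId": [1, 2, 3],
        "Survived": [0, 1, 1],
        "Age": [22.0, 38.0, 26.0],
    }
)

n_rows, n_cols = df.shape             # (3, 3): three rows, three columns
survivors = df[df["Survived"] == 1]   # keep only rows where Survived is 1
mean_age = df["Age"].mean()           # column-wise aggregation

print(df)
print(n_rows, n_cols, len(survivors), mean_age)
```

Each dictionary key becomes a column and each list position becomes a row, which is the structure every reader function below returns.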

Data

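Uses the titanic dataset from Kaggle (https://www.kaggle.com/c/titanic)
Kaggle is a platform that hosts data analysis competitions
Companies host competitions by providing the data needed to analyze their own tasks, research questions, or key services
Kaggle problems are also used as interview questions when large companies hire experienced staff
Goal of the problem: predict which passengers survived the Titanic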

Loading files with pandas

pandas provides functions that read files in various external formats and convert them into a DataFrame

file format	reader	writer
CSV	read_csv(file_path) - index_col: column to use as the index - usecols: columns to load	to_csv(file_path, encoding='utf-8-sig')
Excel	read_excel - header: row to use as the column names - index_col: column to use as the index - usecols: columns to load	to_excel
JSON	read_json	to_json
SQL	read_sql	to_sql
HTML	read_html - encoding: set to utf-8 or cp949 when Korean text comes out garbled	to_html
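The table lists to_csv with encoding='utf-8-sig'. A small round trip may make the reason clearer: this encoding writes a byte order mark (BOM) at the start of the file, which is commonly what lets spreadsheet programs such as Excel recognize the file as UTF-8 and display Korean text correctly. The sketch below assumes a made-up two-row frame and a hypothetical file name, scores.csv, written to a temporary directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data with Korean column names and values
df = pd.DataFrame({"이름": ["철수", "영희"], "점수": [90, 85]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "scores.csv")  # hypothetical file name

    # Write with a BOM so the file is recognizable as UTF-8
    df.to_csv(csv_path, index=False, encoding="utf-8-sig")

    # The first three bytes of the file are the UTF-8 BOM
    with open(csv_path, "rb") as f:
        has_bom = f.read(3) == b"\xef\xbb\xbf"

    # Reading with the same encoding strips the BOM again
    restored = pd.read_csv(csv_path, encoding="utf-8-sig")

print(has_bom)
print(restored)
```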

import pandas as pd
titanic = pd.read_csv("/content/titanic.csv")  # temporary file

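A file added through Colab's file upload is stored as a temporary file (it disappears if the runtime crashes)

To keep the file, mount Google Drive and read the file from there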
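Mounting Drive could look roughly like the sketch below. It assumes the code runs inside Colab, where the google.colab package is available, and that titanic.csv was saved at the top level of your Drive; the MyDrive folder location and the local fallback directory are assumptions for illustration, not something this post prescribes.

```python
import os

import pandas as pd

try:
    # google.colab only exists inside a Colab runtime
    from google.colab import drive

    drive.mount("/content/drive")          # asks for authorization on first run
    data_dir = "/content/drive/MyDrive"    # assumed location of the file in Drive
except ImportError:
    # Outside Colab: fall back to the current directory (assumption)
    data_dir = "."

csv_file_path = os.path.join(data_dir, "titanic.csv")

# Only read the file if it is actually there
if os.path.exists(csv_file_path):
    titanic = pd.read_csv(csv_file_path)
    print(titanic.shape)
else:
    print(f"File not found: {csv_file_path}")
```

Files read from the mounted folder live in Drive rather than in the runtime's temporary storage, so they are still there in the next session.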

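# read_csv example: index_col, usecols
csv_file_path = "/content/titanic.csv"
# Use the first column as the index
csv_data1 = pd.read_csv(csv_file_path, index_col=0)
# Load only four columns and use PassengerId as the index
csv_data2 = pd.read_csv(csv_file_path, index_col='PassengerId', usecols=['PassengerId', 'Survived', 'Pclass', 'Age'])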
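To try index_col and usecols without downloading the dataset, the same calls can be run against an in-memory CSV. This is only a sketch: the three rows and the passenger names are invented, and io.StringIO stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical rows that reuse the Titanic column names
csv_text = """PassengerId,Survived,Pclass,Name,Age
1,0,3,Passenger A,22
2,1,1,Passenger B,38
3,1,3,Passenger C,26
"""

# index_col=0: the first column (PassengerId) becomes the index
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# usecols limits what is loaded; index_col can also be given by name
subset = pd.read_csv(
    io.StringIO(csv_text),
    index_col="PassengerId",
    usecols=["PassengerId", "Survived", "Pclass", "Age"],
)

print(by_position.columns.tolist())
print(subset)
```

In the second call the Name column is never loaded, which can save memory on wide files.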

# read_excel example: header, index_col, usecols
# header=1 uses the second row (0-based) as the column names
excel_data1 = pd.read_excel(excel_file_path, sheet_name='시트1', header=1, index_col='PassengerId', usecols=['PassengerId', 'Survived', 'Pclass', 'Age'])

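# read_html example: encoding, reset_index
# read_html returns a list of DataFrames, one per table found on the page
quant_data_list = pd.read_html(html_path, encoding='cp949')
kospi = quant_data_list[0]  # assumption: the wanted table is the first one; adjust the index as needed
# Drop rows that are entirely empty, then renumber the index from 0
kospi.dropna(how='all').reset_index(drop=True)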
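Tables scraped from web pages often contain blank spacer rows, which is presumably why the cleanup step is needed. The sketch below shows what dropna(how='all') followed by reset_index(drop=True) does, using a hand-built frame in place of a scraped table; the dates and numbers are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a scraped table with blank spacer rows
raw = pd.DataFrame(
    {
        "date": ["2025.06.05", np.nan, "2025.06.04", np.nan],
        "close": [2800.0, np.nan, 2770.0, np.nan],
        "volume": [500.0, np.nan, np.nan, np.nan],
    }
)

# how='all' drops a row only when every value in it is missing,
# so the third row (missing volume only) is kept
cleaned = raw.dropna(how="all")

# After dropping, the index has gaps (0, 2); drop=True renumbers it
# from 0 without keeping the old index as a column
cleaned = cleaned.reset_index(drop=True)

print(cleaned)
```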

Data Analyst June
