

Data wrangling (or data munging) involves cleaning and transforming raw data into a format suitable for analysis. It includes several processes:
1. Data Cleaning:
Handling Missing Values: Techniques include imputation (mean, median, mode), removal of missing entries, or using algorithms that can handle missing data directly.
Removing Duplicates: Identifying and eliminating duplicate records to ensure data integrity.
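The two cleaning steps above can be sketched with Pandas as follows. This is a minimal illustration, not a fixed recipe: the DataFrame and its column names (name, age, city) are made up for the example, and median/mode imputation is only one of the options listed above.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: one duplicate row, one missing age, one missing city.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Ben", "Chen", "Dia"],
    "age": [30.0, 25.0, 25.0, np.nan, 41.0],
    "city": ["Pune", "Oslo", "Oslo", "Pune", None],
})

# Removing duplicates: drop exact duplicate rows, keeping the first occurrence.
deduped = df.drop_duplicates().reset_index(drop=True)

# Imputation: fill the numeric column with its median and the
# categorical column with its most frequent value (mode).
cleaned = deduped.copy()
cleaned["age"] = cleaned["age"].fillna(cleaned["age"].median())
cleaned["city"] = cleaned["city"].fillna(cleaned["city"].mode().iloc[0])

# Alternative: remove every row that still contains a missing value.
dropped = deduped.dropna()

print(cleaned)
print(dropped)
```

Imputation keeps all rows but introduces estimated values, while dropna() keeps only observed values at the cost of losing rows. Which is better depends on how much data is missing and why.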
2. Data Transformation:
Normalization: Adjusting values to a common scale, for example rescaling a column to the range [0, 1] with min-max scaling.
Encoding Categorical Variables: Converting categorical data into numerical format using techniques like one-hot encoding or label encoding.
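As a rough sketch of these transformations, the example below applies min-max normalization, a hand-written label encoding, and one-hot encoding. The columns (income, size, color) and the size ordering are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "income": [20000.0, 50000.0, 80000.0],
    "size": ["small", "large", "medium"],
    "color": ["red", "blue", "red"],
})

# Min-max normalization: rescale income to the range [0, 1].
# Assumes the column is not constant (max != min).
col = df["income"]
df["income_scaled"] = (col - col.min()) / (col.max() - col.min())

# Label encoding: map each category to an integer.
# The explicit mapping is an assumption that the sizes are ordered.
size_order = {"small": 0, "medium": 1, "large": 2}
df["size_code"] = df["size"].map(size_order)

# One-hot encoding: one 0/1 column per category of "color".
encoded = pd.get_dummies(df, columns=["color"], dtype=int)

print(encoded)
```

Label encoding suits categories with a natural order. For unordered categories such as colors, one-hot encoding avoids implying an order that does not exist.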
3. Feature Engineering:
Creating new features, or deriving them from existing ones, to improve model performance, such as combining date and time into a single feature or extracting domain-specific metrics.
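A small sketch of both ideas, assuming a hypothetical trip log with separate date and time columns. The column names and the derived speed metric are illustrative choices, not part of any specific dataset.

```python
import pandas as pd

# Hypothetical trip log with date and time stored as separate strings.
df = pd.DataFrame({
    "date": ["2024-01-05", "2024-01-06"],
    "time": ["08:30:00", "17:45:00"],
    "distance_km": [12.0, 30.0],
    "duration_h": [0.5, 1.5],
})

# Combine date and time into a single datetime feature.
df["timestamp"] = pd.to_datetime(
    df["date"] + " " + df["time"], format="%Y-%m-%d %H:%M:%S"
)

# Derive simple features from the combined timestamp.
df["hour"] = df["timestamp"].dt.hour
df["is_weekend"] = df["timestamp"].dt.dayofweek >= 5  # Monday=0 ... Sunday=6

# A domain-specific metric built from two existing columns.
df["speed_kmh"] = df["distance_km"] / df["duration_h"]

print(df[["timestamp", "hour", "is_weekend", "speed_kmh"]])
```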
4. Data Integration:
Combining data from multiple sources to create a unified dataset, which can involve merging data frames or databases.
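The sketch below shows two common ways to integrate data in Pandas: a key-based merge and a row-wise concatenation. The customers and orders tables are invented for the example.

```python
import pandas as pd

# Two hypothetical sources that share a customer_id key.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Asha", "Ben", "Chen"],
})
orders = pd.DataFrame({
    "order_id": [10, 11, 12],
    "customer_id": [1, 1, 4],
    "amount": [50.0, 20.0, 75.0],
})

# Left merge: keep every customer, attach matching orders where they exist.
# indicator=True adds a "_merge" column showing where each row came from.
merged = customers.merge(orders, on="customer_id", how="left", indicator=True)

# Concatenation: append rows from another source with the same columns.
more_orders = pd.DataFrame({
    "order_id": [13],
    "customer_id": [2],
    "amount": [15.0],
})
all_orders = pd.concat([orders, more_orders], ignore_index=True)

print(merged)
print(all_orders)
```

The how argument controls which keys survive the merge ("left", "right", "inner", "outer"). Checking the _merge column is a quick way to spot records that failed to match.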
5. Outlier Detection and Treatment:
Identifying outliers and deciding how to handle them, which may involve removal, transformation, or capping.
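One possible way to do this is the interquartile range (IQR) rule, sketched below. The IQR rule and the 1.5 multiplier are a common convention chosen here for illustration; the notes above do not prescribe a specific detection method, and the data is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with one obviously extreme value.
s = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 95.0], name="value")

# Detection: flag values more than 1.5 * IQR outside the quartiles.
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (s < lower) | (s > upper)

# Treatment option 1: removal.
removed = s[~is_outlier]

# Treatment option 2: capping values at the computed bounds.
capped = s.clip(lower=lower, upper=upper)

# Treatment option 3: transformation (log1p compresses large values).
logged = np.log1p(s)

print(is_outlier.tolist())
print(capped.tolist())
```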
6. Reshaping Data:
Using pivot tables, melting, or stacking to change the layout of a dataset so that it is better suited to analysis or visualization.
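The three reshaping operations can be sketched on a small, made-up sales table. The column names (store, jan, feb) are assumptions for the example.

```python
import pandas as pd

# Hypothetical wide table: one row per store, one column per month.
wide = pd.DataFrame({
    "store": ["A", "B"],
    "jan": [100, 80],
    "feb": [120, 90],
})

# Melting: wide to long, one row per (store, month) pair.
long_df = wide.melt(id_vars="store", var_name="month", value_name="sales")

# Pivot table: long back to wide, aggregating sales per store and month.
pivoted = long_df.pivot_table(
    index="store", columns="month", values="sales", aggfunc="sum"
)

# Stacking: move the column labels into an inner index level,
# giving a Series indexed by (store, month).
stacked = wide.set_index("store").stack()

print(long_df)
print(pivoted)
print(stacked)
```

Long format is usually the more convenient shape for grouping and plotting, while wide format is easier to read as a table.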
Tools and Libraries
Pandas: A powerful Python library for data manipulation and analysis, offering functions for cleaning, transforming, reshaping, and merging data.
NumPy: Useful for numerical operations and handling arrays.