This is my first project on data science, data analysis, and visualization!
Aim of the Project:
What makes instant noodles so popular globally, and what characteristic features distinguish one brand or variety of ramen from another?
Libraries and packages:
This dataset is an export of “The Big List” (of reviews) from the Ramen Rater website, converted to CSV format.
Columns in this dataset include Brand, Variety (the product name), Country, and Style (how the ramen is served). Stars indicate the ramen quality, as assessed by the reviewer, on a 5-point scale.
The data contains the following columns:
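As a minimal sketch of this step, assuming pandas is the library in use, the snippet below loads a CSV and lists its columns. The sample rows (including the brand names and the non-numeric Stars entry) are invented for illustration; only the column names come from the description above, and the real export may contain more columns.

```python
import io

import pandas as pd

# Invented sample rows for illustration only; not taken from the real export.
SAMPLE_CSV = """Brand,Variety,Style,Country,Stars
BrandA,Chicken Flavor Noodles,Pack,Japan,3.75
BrandA,Spicy Beef Noodles,Cup,Japan,4.5
BrandB,Curry Noodles,Bowl,Malaysia,5
BrandC,Seafood Noodles,Box,USA,Unrated
"""

# With the real file, this would be pd.read_csv(<path to the CSV export>).
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Stars is read as text if any entry is non-numeric,
# so coerce it to numbers and turn unparseable entries into NaN.
df["Stars"] = pd.to_numeric(df["Stars"], errors="coerce")

print(df.columns.tolist())
print(df.dtypes)
```

Coercing Stars up front is a defensive choice: later steps such as averaging assume a numeric column.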
Checking for Unique Values:
1. The styles ‘Can’, ‘Bar’, and ‘Box’ appear to be outliers, with very few stars and occurrences.
2. The dataset contains brands and varieties that are not ramen, for example: Brand = ‘Pringles’, Variety = ‘Nissin Top Ramen Chicken Flavor Potato Crisp’.
3. Discovering Outliers:
It appears that some brands have been rated 0.0, which is invalid.
4. Removing Outliers:
a) Deleting the outlier styles: ‘Box’, ‘Can’, and ‘Bar’.
b) Deleting rows with a rating of 0.0, as it is an invalid value.
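The checks and the clean-up described above can be sketched roughly as follows. This assumes pandas and a DataFrame with Style and Stars columns; the rows are invented for illustration, and the variable names (rare_styles, cleaned) are hypothetical.

```python
import pandas as pd

# Invented sample rows for illustration only.
df = pd.DataFrame({
    "Brand": ["BrandA", "BrandA", "BrandB", "BrandC", "BrandD", "BrandE"],
    "Style": ["Pack", "Cup", "Bowl", "Box", "Can", "Pack"],
    "Stars": [3.75, 4.5, 5.0, 2.0, 3.5, 0.0],
})

# Step 1: count how often each style occurs to spot the rare ones.
print(df["Style"].value_counts())

# Step 3: list the rows rated 0.0, which this analysis treats as invalid.
print(df[df["Stars"] == 0.0])

# Step 4: drop the rare styles (a) and the 0.0 ratings (b).
rare_styles = ["Box", "Can", "Bar"]
keep = ~df["Style"].isin(rare_styles) & (df["Stars"] != 0.0)
cleaned = df[keep].reset_index(drop=True)
print(cleaned)
```

Building one boolean mask and filtering once leaves the original DataFrame untouched, which makes it easy to compare row counts before and after.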
Data Analysis and Inference:
The more reviews a brand has, the more people have tried it.
We won’t classify a brand as good based on star ratings alone, since any brand can easily make the list with just one 5-star product.
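One possible way to act on this reasoning is to count reviews per brand alongside the mean rating, and to rank only brands that clear a minimum number of reviews. The sketch below assumes pandas; the data is invented, and the MIN_REVIEWS cutoff is a hypothetical value rather than one taken from the original analysis.

```python
import pandas as pd

# Invented sample rows for illustration only.
df = pd.DataFrame({
    "Brand": ["BrandA"] * 3 + ["BrandB"] * 2 + ["BrandC"],
    "Stars": [4.0, 3.5, 4.5, 3.0, 4.0, 5.0],
})

MIN_REVIEWS = 2  # hypothetical cutoff; tune it for the real dataset

# Per-brand review count and mean rating.
brand_stats = (
    df.groupby("Brand")["Stars"]
    .agg(reviews="count", mean_stars="mean")
    .reset_index()
)

# Keep brands with enough reviews, then rank by review count and mean rating.
top_brands = (
    brand_stats[brand_stats["reviews"] >= MIN_REVIEWS]
    .sort_values(["reviews", "mean_stars"], ascending=False)
    .head(10)
)
print(top_brands)
```

Here BrandC has a single 5-star product, yet it does not reach the ranking because it falls below the review cutoff.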
Top 10 Best Brands:
Rating Distribution by Country:
Preference by Style of Ramen Served:
Country vs Style:
Brand vs Country vs Stars:
Country, Brand, and mean rating for each country:
Country, Style, and number of reviews of each style per country:
Country, Style, and mean rating for each country:
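The three groupings above could be computed roughly as follows. This is a sketch assuming pandas and invented rows; the output column names (mean_stars, reviews) are hypothetical.

```python
import pandas as pd

# Invented sample rows for illustration only.
df = pd.DataFrame({
    "Country": ["Japan", "Japan", "Japan", "USA", "USA"],
    "Brand": ["BrandA", "BrandA", "BrandB", "BrandC", "BrandC"],
    "Style": ["Pack", "Cup", "Pack", "Cup", "Cup"],
    "Stars": [4.0, 5.0, 3.0, 2.0, 4.0],
})

# Mean rating per (Country, Brand) pair.
country_brand_mean = (
    df.groupby(["Country", "Brand"])["Stars"].mean().reset_index(name="mean_stars")
)

# Number of reviews per (Country, Style) pair.
country_style_count = (
    df.groupby(["Country", "Style"]).size().reset_index(name="reviews")
)

# Mean rating per (Country, Style) pair.
country_style_mean = (
    df.groupby(["Country", "Style"])["Stars"].mean().reset_index(name="mean_stars")
)

print(country_brand_mean)
print(country_style_count)
print(country_style_mean)
```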
Distribution of Star Ratings:
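Without a plot, the distribution can still be inspected as counts per rating bin. The sketch below assumes pandas, uses invented ratings, and picks one-star-wide bins as an arbitrary choice.

```python
import pandas as pd

# Invented ratings for illustration only.
stars = pd.Series([0.5, 2.0, 3.25, 3.5, 3.75, 4.0, 5.0, 5.0], name="Stars")

# One-star-wide bins on the 5-point scale; each bin includes its right edge.
bins = [0, 1, 2, 3, 4, 5]
binned = pd.cut(stars, bins=bins, include_lowest=True)

# Count ratings per bin, ordered from the lowest bin to the highest.
distribution = binned.value_counts().sort_index()
print(distribution)
```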
Top 100 Ramen Product Strings:
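If "product strings" is read as the most frequent words in the Variety names (an assumption, since the original computation is not shown), a minimal sketch using only the Python standard library could look like this. The variety names are invented.

```python
from collections import Counter

# Invented variety names for illustration only.
varieties = [
    "Chicken Flavor Noodles",
    "Spicy Chicken Noodles",
    "Beef Flavor Noodles",
]

# Count lowercase words across all variety names.
word_counts = Counter(
    word.lower()
    for name in varieties
    for word in name.split()
)

# Up to 100 most frequent words, as (word, count) pairs.
top_words = word_counts.most_common(100)
print(top_words)
```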
Arvind, Shamia, Ullas, Prabhu