Proof of Concepts Showcase

Welcome to our Proof of Concepts showcase, where innovation meets actionable insights. At Intelcraft Infotech, we leverage cutting-edge tools like Power BI, Tableau Desktop, Python and Google Looker to transform data into compelling stories that drive business success. From dynamic dashboards that offer instant data visibility to intricate data models that provide deep analytical insights and predictive machine learning models, our PoC projects demonstrate our commitment to delivering customized, high-impact solutions. Please explore our PoCs to see how we can turn your data challenges into strategic advantages, ensuring your decision-making is informed, agile, and ahead of the curve.

Please note: All of these PoC projects use dummy datasets only. If, after viewing our PoCs, you are a prospective client or company interested in partnering with us to develop BI and data solutions, or a student or company (on behalf of your staff) interested in purchasing and enrolling in our training courses, please call (or WhatsApp) us on +61-470298065/+91-9342828085 or email us at info@intelcraftinfotech.com to discuss further.

Power BI Dashboard Development Project: 1

The Power BI dashboard below (Proof of Concept) contains pages with preliminary and final sales figures for the last 4 months and sales forecast figures for the next 3 months. The dashboard is fully automated: when the month changes, it fetches the sales data for the last 4 months and the sales forecast data for the next 3 months without any manual intervention. It also provides toggle buttons to switch between preliminary sales, final sales and sales forecast figures.
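The dashboard itself is built in Power BI. The Python sketch below only illustrates the kind of rolling date logic that such automation relies on. The function names are hypothetical. It assumes that "last 4 months" means the four completed months before the current one and that the forecast window starts with the current month; the actual dashboard may define these windows differently.

```python
from datetime import date


def shift_month(year, month, offset):
    """Return (year, month) moved by `offset` months (offset may be negative)."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def reporting_windows(today, past=4, future=3):
    """Months shown on the dashboard relative to `today`.

    Assumption: actuals cover the `past` completed months before the current
    month, and the forecast covers `future` months starting with the current one.
    """
    actuals = [shift_month(today.year, today.month, -k) for k in range(past, 0, -1)]
    forecast = [shift_month(today.year, today.month, k) for k in range(future)]
    return actuals, forecast


actuals, forecast = reporting_windows(date(2024, 2, 15))
print(actuals)   # months with sales figures
print(forecast)  # months with forecast figures
```

Because the windows are derived from the current date, they move forward on their own when the month changes, including across a year boundary.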

Power BI Dashboard Development Project: 2

The Power BI dashboard below (Proof of Concept) contains pages regarding MOBILE SALES ON AN E-COMMERCE WEBSITE.

Power BI Dashboard Development Project: 3

The Power BI dashboard below (Proof of Concept) contains details regarding MOBILE USAGE AND OTHER DETAILS OF CUSTOMERS IN A PREPAID SIM CARD COMPANY.

Power BI Dashboard Development Project: 4

The Power BI dashboard below (Proof of Concept) contains details regarding SUPERMARKET SALES ANALYSIS.

Power BI Dashboard Development Project: 5

The Power BI dashboard below (Proof of Concept) contains details regarding WORLD POPULATION ANALYSIS.

Power BI Dashboard Development Project: 6

The Power BI dashboard below (Proof of Concept) contains details regarding an EDTECH COMPANY’S COURSE ANALYSIS.

Power BI Dashboard Development Project: 7

The Power BI dashboard below (Proof of Concept) contains details regarding HEALTHCARE SECTOR ANALYSIS.

Tableau Dashboard Development Project: 1

This ‘.twbx’ file contains a sample Tableau dashboard for an imaginary telecom company, built with dummy data. Parameters, filters, calculations, actions and many other Tableau features are used in this dashboard. It also includes some customized charts, such as a pie chart and a side-by-side bar chart with arrow marks.

 

Google Looker Dashboard Development Project: 1

The Google Looker dashboard below (Proof of Concept) contains details regarding MOBILE USAGE DETAILS OF TOP 10 CUSTOMERS IN A PREPAID SIM CARD COMPANY.

Python Data Science Project: 1

                              MODEL FOR DIABETES PREDICTION USING RANDOM FOREST CLASSIFIER MACHINE LEARNING ALGORITHM

Aim

Our aim is to predict whether a person has diabetes, based on the input values the user provides. The model is trained on a dataset of vital-sign information and then predicts the outcome for a new person from their input data.

 

Dataset analysis

The dataset can be found in the below link

Diabetes Prediction Dataset CSV file

This dataset contains diabetes status for a set of people based on the following factors.

  • Gender
  • Age: Age of the person
  • Hypertension: 1 if the person has high blood pressure, otherwise 0.
  • Heart_disease: 1 if the person has heart disease, otherwise 0.
  • Smoking_history: Whether the person is a current smoker, former smoker, non-smoker, etc.
  • BMI: The Body Mass Index of the person.
  • HbA1c_level: The HbA1c (glycated hemoglobin) level, a measure of average blood sugar over the past 2 to 3 months. In this dataset it varies between 0 and 10.
  • Blood_glucose_level: The blood sugar level.
  • Diabetes: Whether the person has diabetes or not (target variable).

 

  1. Finding the gender distribution:  

 

          Fig 1: Gender Distribution

The above figure shows the gender distribution. Here 41.6% of people are female, 58.4% are male and 0.0187% are recorded as other.

 

   2. Diabetes distribution using HbA1c_level:

                       Fig 2:Distribution based on HbA1c_level

Here the HbA1c_level ranges from 0 to 10. In this analysis, an HbA1c_level between 0 and 5.7 is treated as non-diabetic for people of all ages. An HbA1c_level between 5.7 and 10 is treated as potentially diabetic, but other features such as age, health condition and smoking history also have to be considered. This feature plays a vital role in the prediction.

 

   3. Analysis based on Smoking History:

 

                Fig 3: Distribution based on Smoking History

Here the distribution of the dataset by smoking history is considered. People are separated into non-smokers, people who do not smoke now, current smokers, former smokers and those with no information on smoking. In every segment a small percentage of people have diabetes and the majority do not, so this feature does not strongly affect the output.

 

   4. Diabetes distribution based on Age:

                            Fig 4: Diabetes distribution based on Age

Here the people are segregated by age and by whether they have diabetes or not. The color legend indicates the HbA1c_level of each person. People with diabetes appear across the whole age range of 0 to 100.

 

   5. Diabetes distribution based on BMI and AGE:

                          Fig 5: Diabetes distribution based on BMI and AGE

Here a few people have diabetes in every range of BMI, but the number of people who have diabetes increases with age and BMI.

 

   6. Distribution of features relationship with each other:

             Fig 6: Distribution of features relationship with each other

Here the scatter plot explains the relationship of each feature with one another. So here we can analyze the relationship and dependency of each feature with other features.

 

   7. Heatmap to find the Dependent variables:

                      Fig 7: Heatmap to find the Correlation between Variables

This heatmap is used to find the features that are most strongly correlated with the target variable.

If the value is close to 1, the two variables are positively correlated.

If the value is close to -1, the two variables are negatively correlated.

If the value is close to 0, there is not much correlation between the two variables.

From this heatmap we can see that the features most strongly related to the Diabetes variable are HbA1c_level and blood_glucose_level, followed by age and bmi.
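Each cell of such a heatmap is a correlation coefficient between two columns. As a minimal, dependency-free sketch, the Pearson correlation can be computed as shown below. The sample values are made up for illustration and are not taken from the dataset; the original notebook presumably used a library routine for this.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values, for illustration only (not taken from the dataset).
hba1c = [5.0, 5.5, 6.0, 6.5, 7.0, 8.0]
diabetes = [0, 0, 0, 1, 1, 1]

print(round(pearson(hba1c, diabetes), 3))
```

A value close to 1 here would mean that higher HbA1c values go together with the positive class in the sample, which is how the heatmap cells are read.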

 

   8. Model Selection for Diabetes Prediction:

After completing the analysis, the next step is to build the model using a suitable machine learning algorithm. Naive Bayes, Decision Tree Classifier, Random Forest Classifier, Gradient Boosting and XGBoost models were tested on this dataset. Based on the accuracy, F1 score, precision and recall of these models, the Random Forest Classifier was chosen for predicting diabetes on this dataset.

The model built with the Random Forest Classifier has

Accuracy : 96.99

Precision : 94.98

Recall : 69.00

f1 score : 79.93

The model is robust, neither overfitted nor underfitted, and is now ready for prediction. It takes the values of the different features as input and predicts the output from them.
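For reference, the four metrics above are all derived from the confusion matrix. The sketch below shows the formulas and checks that the reported F1 score is consistent with the reported precision and recall. The confusion-matrix counts are hypothetical and chosen only for illustration; they are not the counts from this project.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 (as fractions) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Consistency check on the reported figures (percentages from the text above).
print(round(f1_from(94.98, 69.00), 2))

# Hypothetical confusion-matrix counts, for illustration only.
accuracy, precision, recall, f1 = classification_metrics(tp=69, fp=4, fn=31, tn=896)
print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1, 3))
```

By definition, a recall of 69% means that about 31% of the actual diabetic cases in the test data are not flagged by the model, which is worth weighing alongside the high accuracy.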

 

Sample Output:

The final output generated by this model for the given input values is shown in the screenshot below, and it is

‘You DONT HAVE diabetes’

 

Please note: This project was done entirely in Python using Google Colab. A lot of work went into exploratory data analysis, data preparation, model selection, feature selection, and training and testing of different models.

 

Python Data Science Project: 2

                      CUSTOMER SEGMENTATION USING KMEANS MACHINE LEARNING ALGORITHM

Aim

The aim of this project is to segment the customers of service-sector organizations such as banks, malls and other companies. This helps to categorize customers and to identify potential customers.

 

Dataset analysis

The dataset can be found in the below link

bank_transactions.csv

This dataset contains a bank’s customer transaction details. The fields are given below.

  • TransactionID                    : A unique identifier generated for each transaction.
  • CustomerID                        : A unique identifier for each customer of the bank.
  • CustomerDOB                    : Date of Birth of the Customer.
  • CustGender                        : It provides information about Customer’s Gender.
  • CustLocation                     : It gives details of the customer’s Location.
  • CustAccountBalance       : It gives customer’s account balance details.
  • TransactionDate               : It provides information about the date that Transaction was done.
  • TransactionTime              : It gives information about the number of times the Transaction has occurred.
  • TransactionAmount(INR): It provides information about the amount transacted from that particular account.

 

  1. Data Analysis based on Dataset

    1.1. Analysis based on Gender Distribution

The pie chart below analyzes the gender distribution. From the chart, we can see that the bank has more male customers than female customers.

                Fig 1: Pie- Chart on Gender Distribution

 

    1.2. Analysis based on Customer’s Location

This chart gives a clear picture of which locations have the most customers. It shows the top 25 locations by number of customers. Mumbai is at the top with the highest number of customers.

                                  Fig 2: Bar Chart on Customer’s Location

 

    1.3. Analysis based on Age of Customers

The chart below analyzes the age of the bank’s customers. From the chart we can see that most customers are in the 20 to 30 age range.

              Fig 3: Distribution on Age of the Customers

 

    1.4. Analysis based on Transaction Amount

Analyzing the transaction amount shows which amounts are transacted most frequently. From the chart below, we can see that amounts between 300k and 400k are transacted most often.

               Fig 4: Distribution based on Transaction Amount

 

    1.5. Analysis based on Account Balance

This chart shows customers’ account balance amounts and their counts. Account balances above 400k have the highest count.

                       Fig 5: Count of Individual Account Balance Amount

 

    1.6. Analysis based on Gender and Customer Location

The chart below provides information about the gender and the location of the customers.

Fig 6: Chart based on Gender and Location

 

    1.7. Analysis based on Age and Transaction amount

This chart helps to understand the relation between the transaction amount and the age of the customers. From the chart below, we can see that people aged above 25 and below 45 have high transaction amounts.

Fig 7: Scatter plot between Transaction amount and Customer’s Age

 

    1.8. Heatmap of Features in the Dataset

The chart below shows the relationship between the features. Most of the correlation values are less than 0.1.

This is an unlabeled dataset, so customers can be segmented using clustering.

                            Fig 8: Heatmap of features in the Dataset

 

    2.1. Model

Customers can be segmented using a clustering technique, which groups customers based on their similarity to each other. The curve below is used to choose the number of clusters by locating the sharp bend (the elbow) in the curve.

                  Fig 9: Chart between Number of Clusters and Sum of Squares
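The curve in Fig 9 plots the within-cluster sum of squares against the number of clusters. The sketch below shows how such a curve can be produced, using a small pure-Python k-means on one-dimensional data so that it runs without any libraries. The balances are made-up values, the function name is hypothetical, and the initialization scheme is an assumption chosen to keep the example deterministic; the original notebook may have used a library implementation with different settings.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on 1-D data; returns (centroids, sum of squared distances)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        centroids = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    inertia = sum(min((v - c) ** 2 for c in centroids) for v in values)
    return centroids, inertia


# Made-up account balances forming three loose groups (illustration only).
balances = [1, 2, 3, 2, 50, 52, 49, 51, 100, 101, 99, 102]

for k in range(1, 6):
    _, inertia = kmeans_1d(balances, k)
    print(k, round(inertia, 1))
```

On data like this, the sum of squares drops steeply up to three clusters and only slightly afterwards, and that bend is what is read off the chart as the number of clusters.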

 

    2.2. Scatter Matrix Plot based on various Features

The matrix plot is drawn for the features Cluster, Customer Gender, Customer Location, Customer Age, Transaction Amount and Customer Account Balance. This plot is used to find which features are correlated with the cluster.

                    Fig 10: Matrix plot based on Different Features

 

    2.3. Relationship between Customer Age and Cluster

From the below chart, we can find that the customer age is equally distributed in all clusters. So the Cluster is not segmented based on Customer’s Age.

Fig 11: Relationship between Customer Age and Cluster

 

    2.4. Relationship between Cluster and Customer Location

From the below chart, we can see that the Customer location is also equally distributed in all the clusters. So, the Cluster is not segmented based on the Customer location.

Fig 12 : Relationship between Customer Location and Cluster

 

    2.5. Relationship between Cluster and Customer Account Balance

From the below chart it is clear that the clusters are grouped by the account balance of the customers. We can see that there are different clusters with different ranges of account balance.

Fig 13: Relationship between Cluster and Account Balance

 

    2.6. Relationship Between Clusters and number of Transactions

The chart below gives the relationship between clusters and the number of transactions. The number of transactions is also distributed equally across all clusters, so the cluster segmentation is not based on the number of transactions.

Fig 14: Relationship between Clusters and No. of. Transactions

 

    2.7. Customer Segmentation

The customers are segmented based on their account balance. The number of customers in each cluster is given below.

Fig 15: Clusters and the Customers Count

Cluster 4 and Cluster 10 have fewer customers than the other clusters in the customer segmentation. The data frames for these clusters are given below.

Fig 16: Segmented Dataframe in Cluster 4

Fig 17: Segmented Dataframe in Cluster 10

 

    2.8. Result

The bank’s customers are segmented successfully, which can support further research by the bank into identifying potential customers who engage in frequent and higher-volume transactions.

 

Please note: This project was done entirely in Python using Google Colab. A lot of work went into exploratory data analysis, data preparation, model selection, feature selection, and training and testing of the model.

 

Python Data Science Project: 3

                                  ONLINE MOBILE SALES ANALYSIS AND SALES QUANTITY PREDICTION USING MACHINE LEARNING ALGORITHM

Aim:

The objective is to address a hypothetical business problem for an online mobile seller. The seller wants to sell mobile phones online and is looking for the products, brands, specifications and deals that can generate the most revenue with the least investment, within budget constraints.

 

Dataset Analysis:

The dataset can be found in the below link

Online Mobile Sales.csv

This dataset contains mobile sales data for an online seller, along with the features of the mobiles sold.

 

Please note: The ‘sales’ column in this dataset which represents the quantity of mobiles sold is in hundreds. We can get the original number of mobiles sold by multiplying the ‘sales’ column by 100.

 

    1. Analyzing the Mobiles Dataset based on their features:

    1.1. Based on Brand:

Fig 1.1: No. of Mobile Models by Brand

The above table shows the number of mobile models by brand, in descending order. From the table, we can observe that the Realme brand has the most mobile models, whereas the Poco and Apple brands have few.

 

    1.2. Based on Color:

Fig 1.2: No. of Mobile Models by Color

Blue is used in the most mobile models, followed by black. Bronze is used in few mobile models.

 

    1.3. Based on Processor:

Fig 1.3: No. of Mobile Models by Processor

The Qualcomm processor is used in the most mobile models, followed by the MediaTek processor.

 

   1.4. Based on Screen Size:

Fig 1.4: No. of Mobile Models by Screen Size

Large screens are used in the most mobile models, followed by medium screens. Small, very small and very large screens are used in fewer mobile models.

 

    1.5 . Based on RAM used:

Fig 1.5: No. of Mobile Models by RAM used

4 GB RAM is used in the most mobile models, followed by 6 GB RAM. 1 GB RAM is used in few mobile models.

 

    1.6. Based on ROM:

Fig 1.6: No. of Mobile Models by ROM

128GB ROM is used in most mobile models, followed by 64GB ROM.

 

    1.7. Based on Ratings:

Fig 1.7: Based on Ratings

In online mobile sales, people mostly like to buy products based on ratings, so ratings play a vital role in sales. Based on the dataset, we can observe that the most common rating among mobile models is 4.3, followed by 4.4.

 

    2. Data Visualization Based on Mobile Sales Dataset:

    2.1. Mobile Sales based on Mobile Price:

Fig 2.1: Mobiles Sold by Price

From the above figure we can see that most mobiles are sold in the price range of Rs.8,000 to Rs.25,000. Very few mobiles are sold above Rs.40,000.

 

    2.2. Mobile Sold by Brand:

           Fig 2.2: Mobiles Sold by Brand

As we can see, the best-selling brand is Realme, followed by Xiaomi. The less preferred brands are Samsung and Apple.

 

    2.3. Sales by Mobile brand and model:

Fig 2.3: No. of Mobiles Sold by brand and model

The above figure shows the number of mobiles sold by brand and model. 

 

    2.4. Results based on Analysis:

After analyzing the dataset, we find that top-end and mid-range mobiles sell more. Among high-end models, iPhone and Samsung mobiles with 128GB ROM sell more. People with a budget of around Rs.20,000 mostly go for the Realme brand. The online seller can focus more on the Poco brand, which offers lower-priced mobiles with good dimensions, ROM and RAM capacity.

 

    3.1. Heatmap to find the Dependent Variables

This heatmap is used to find the variables on which the target variable depends. Here the Sales variable is the target (dependent) variable. The variables most strongly related to it are No. of Ratings, followed by Color, Sales price, RAM, ROM etc.

                                      Fig 3.1: Heatmap to find the Dependent Variables

 

    3.2. Machine Learning Algorithm selection for Predictive Model:

Random Forest, Gradient Boosting, XGBoost and Decision Tree algorithms were tested on this dataset, and the Decision Tree algorithm was selected based on the following evaluation metrics:

MAE: 162.93051858966172

MSE: 2093.276611328125

RMSE: 45.752339954674724

R-squared: 0.3693307638168335

With this algorithm, the model trains effectively, although with an R-squared of about 0.37 its fit is modest.
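As a reference for how these four regression metrics relate to each other, here is a minimal sketch with made-up values that are not from the dataset. RMSE is the square root of MSE, which matches the figures above (the square root of 2093.28 is about 45.75). On a single set of predictions MAE can never exceed RMSE, so the reported MAE of 162.93 was presumably computed on a different scale or evaluation split than the other three values.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R-squared for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = sqrt(mse)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                     # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)    # total variation
    r2 = 1 - ss_res / ss_tot
    return mae, mse, rmse, r2


# Made-up sales quantities and predictions, for illustration only.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]

mae, mse, rmse, r2 = regression_metrics(actual, predicted)
print(mae, mse, round(rmse, 3), round(r2, 3))
```

An R-squared near 1 means the predictions explain most of the variation in the target, while a value near 0 means they explain little of it.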

 

Please note: This project was done entirely in Python using Google Colab. A lot of work went into exploratory data analysis, data preparation, model selection, feature selection, and training and testing of different models.

 

 

Python Data Science Project: 4

                         MODEL FOR PREDICTING THE SELLING PRICE OF USED CARS USING DECISION TREE MACHINE LEARNING ALGORITHM

Aim:

Our aim is to predict the selling price of used cars based on different features that affect the price of the car. We have to train the most suitable machine learning algorithm with the data and create a robust model for prediction.

 

Dataset Analysis

The dataset can be found in the below link

cardata.csv

This dataset contains selling price of used cars based on the following features.

  • Car_Name            : The model name of the car that is sold.
  • Year                       : The car’s year of make.
  • Selling_Price        : This column gives information about the selling price of the car in lakhs.
  • Initial_Price          : This column gives information about the current market price of the car in lakhs.
  • Kms_Driven          : It gives information regarding the kilometers driven in that car.
  • Fuel_Type             : This field gives information about the fuel used, whether it is petrol or diesel or CNG.
  • Seller_Type          : This field gives information about who sells the car. A dealer or individual.
  • Transmission      : This field gives information about the car’s transmission, whether manual or automatic.
  • Owner                   : This field gives information about the number of previous owners.

 

    1. Comparison of Initial Price of the cars with their Selling Price

Here the initial price of the cars and their selling price are compared.

This comparison plot gives a good knowledge about which range of cars are sold more.

                                   Fig 1: Comparison plot between Selling price and Initial price

 

    2. Analysis between Kms Driven and Selling Price of the Cars

                                Fig 2: Plot between kms driven and selling price

The above plot gives the relation between the kilometers driven by the car and its selling price. Here we can observe that the car’s selling price varies with the kilometers driven. Please note that the color legend displayed in this screenshot does not include all the car names.

 

    3. Analysis of Selling Price based on Car Model

This plot shows the car model name, its selling price and the number of cars sold. From this chart, we can see that cars like the Land Cruiser, Fortuner, Corolla Altis, Innova and Creta have high selling prices. The cars sold the most are the City, Corolla Altis, Verna, Brio and Fortuner. Please note that the color legend displayed in this screenshot does not include all the car names.

                                          Fig 3: Plot about Cars, its Selling Price and its Count Sold

 

    4. Analysis of total revenue generated based on the year of make of the car

                 Fig 4: Pie chart showing Revenue based on the year of make of Car

The year of make of the car plays a vital role here. This dataset has data till the year 2017. From the pie chart, we can see that the cars bought between 2014 to 2017 have generally generated more revenue.

 

    5. Analysis based on the Vehicle’s Fuel Type

By looking at the pie chart, we can find that petrol vehicles are sold more compared to diesel vehicles. CNG vehicles are sold the least.

         Fig 5: Pie Chart about Fuel used in the Vehicles

 

    6. Analysis based on different features

    Fig 6: Overall Distribution of Sales based on different features

Here features like fuel type, transmission, ownership and seller type of the car are analyzed. Petrol cars with manual transmission that are sold by their first owners are sold the most. Also, more cars are sold by dealers than by individuals.

 

    7. Correlation between Target Variable and Other Variables

   Fig 7: Heatmap to find the correlation between Variables

From the heatmap, we can see that the Selling_Price column is highly correlated with the Initial_Price column, followed by the car model. The selling price is negatively correlated with fuel type, seller type and transmission.

 

    8. Model to predict the Selling Price of the Used Car

After the analysis, the next step is to build a model to predict the selling price of a used car. The output variable is continuous, so a regression model is needed. We can consider different regression models such as linear regression, random forest, gradient boosting and decision tree regression. For a well-fitting model, the R-squared value should be close to 1.

 

Result:

The models are built and the selling price is predicted for the used car. The evaluation metrics for the decision tree regression are

Decision Tree Regressor Model Performance:

MAE: 1.08

MSE: 5.36

RMSE: 2.31

R-squared: 0.79

The above model is robust and well suited to predicting the selling price of a used car.

Considering the evaluation metrics, the best model for this dataset is the Decision Tree algorithm. The predicted selling price of a car for different input features is shown below.

         Fig 8: Predicted sales price for a used car

 

Please note: This project was done entirely in Python using Google Colab. A lot of work went into exploratory data analysis, data preparation, model selection, feature selection, and training and testing of different models.

 

 

Python Data Science Project: 5

                               WEBSERIES RECOMMENDATION SYSTEM USING NLP

Aim:

The aim of this project is to give suggestions on web series to the users based on their likes and interests.

 

Dataset Analysis

The dataset can be found in the below link

All_Streaming_Shows.csv

This dataset contains details about webseries and their streaming platforms. The fields present in the dataset are given below

  • Series Title                      : The Name of the web series.
  • Year Released                 : The year of release of the web series.
  • Content Rating                : The age group for which the web series is suitable.
  • IMDB Rating                     : The rating given to the web series on the Internet Movie Database website.
  • R Rating                            : The Rating given by the Motion Pictures Association of America to the web series.
  • Genre                                : Information about the style of webseries.
  • Description                      : Gives short notes about the webseries.
  • No of Seasons                : The number of seasons the series has.
  • Streaming Platform       : Gives information about where the webseries is streaming.

 

    1. Analysis based on Genre Distribution

             Using genre distribution, it is easy to find which genres people like and watch the most. The below figure shows the top genres watched in different streaming platforms.

                              Fig 1: Bar Chart about Genre Distribution

 

    2. Analysis on Streaming Platform Distribution

The Data Analysis on Streaming platforms is useful for finding which streaming platform is used the most. The figure shown below gives details about streaming platform usage in this dataset.

                            Fig 2: Bar chart about Top Streaming Platforms

 

    3. Analysis on Content Ratings

The dataset is analyzed by content rating, which specifies the minimum age for watching a particular series. The chart below shows that web series rated 16+ are watched the most, followed by those rated 7+ and 18+. Web series rated 13+ are the least watched.

              Fig 3: Analysis on Content Ratings

 

    4. Analysis on IMDB Ratings

IMDB stands for Internet Movie Database. On this website, every web series has a rating given by registered IMDB users. Based on the chart below, most of the shows have a rating close to seven.

                         Fig 4: Analysis based on IMDB Rating

 

    5. Analysis based on R-Ratings Distribution

R-rating means Restricted Rating, which is given by the Motion Pictures Association of America. The figure below shows the distribution of ratings and their counts. According to the chart, most of the web series in this dataset have a rating between 70 and 75.

                                     Fig 5: Distribution based on R- Rating

 

    6. Data Distribution based on Year of Release

The graph below shows the data distribution by the year of release of the web series. After the year 2000, the number of web series released increased rapidly. The largest number of web series were released in 2017.

                    Fig 6: Distribution based on Year of Release of webseries

 

    7. Analysis of IMDB Rating based on Year of Release

The chart below compares IMDB Rating with Year of Release. From this chart we can see that the year of release has no visible effect on the IMDB rating.

                                    Fig 7: Distribution plot between Year of Release and IMDB Rating

 

    8. Recommendation System for Webseries Using NLP

The web series recommendation system considers aspects such as Genre, Cast, Content Rating, Description and Streaming Platform. After combining these features, the system finds the similarity between the corresponding vectors. The vectors are built from the keywords (tokens) of all the features considered.
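The idea of comparing keyword vectors can be sketched as follows. This is a minimal illustration that uses simple word counts and cosine similarity, which is one common choice; the text does not state which vectorization or similarity measure the project actually used. The catalogue entries and function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two keyword-count vectors (Counter objects)."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(liked, catalogue, top_n=10):
    """Return up to `top_n` titles whose keywords are most similar to `liked`."""
    vectors = {title: Counter(text.lower().split()) for title, text in catalogue.items()}
    target = vectors[liked]
    scores = [
        (cosine_similarity(target, vector), title)
        for title, vector in vectors.items()
        if title != liked
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scores[:top_n]]


# Hypothetical catalogue: title -> combined keywords from the chosen features.
catalogue = {
    "Show A": "crime drama detective netflix",
    "Show B": "crime thriller detective netflix",
    "Show C": "comedy family sitcom hulu",
    "Show D": "crime drama lawyer prime",
}

print(recommend("Show A", catalogue, top_n=2))
```

The default of `top_n=10` mirrors the limit of ten suggestions described for the system.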

The screenshot below shows the output of the recommendation system based on a web series that the user likes. The system gives the user a maximum of ten suggestions to watch next.

          Fig 8: Model Output based on the SHOW you like

The screenshot below shows the suggestions given to the user based on the genre they like.

        Fig 9: Recommendation of Webseries Based on Genre

 

    9. Result

The recommendation system for this web series dataset was built successfully using the features title, cast, description and streaming platform, and it works well.

 

Please note: This project was done entirely in Python using Google Colab. A lot of work went into exploratory data analysis, data preparation, and training and testing of the model.

 

Python Data Science Project: 6

                                  IMAGE PROCESSING BASICS AND DETECTING TUMOURS USING IMAGE PROCESSING TECHNIQUES

Aim

The objective of this project is to learn the basics of image processing and to automate the detection of tumors in the brain using image processing techniques.

 

Basics of Image Processing

The picture shown below is loaded using the package called Pillow. Pillow is a popular library that can be used to import an image, process it and perform many operations on it.

 

    1. Importing Image

The first step is to import the image in the python notebook using the image processing package.

                               Fig 1: Importing image

 

    2. Rotating image

The image can be rotated as needed, from 1° to 360°. Here the uploaded image is rotated by 180°.

                                    Fig 2: Output of Rotated Image
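To show what a 180° rotation does at the pixel level, here is a small library-free sketch on an image represented as a list of rows. The project itself used Pillow, which provides `Image.rotate(angle)` for this purpose; the function below is only an illustration.

```python
def rotate_180(image):
    """Rotate an image (a list of pixel rows) by 180 degrees.

    Reversing the row order flips the image vertically, and reversing each
    row flips it horizontally; together this is a 180 degree rotation.
    """
    return [row[::-1] for row in image[::-1]]


# Tiny made-up 2x3 "image" with one value per pixel.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

print(rotate_180(image))
```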

 

    3. Gray Scaling

Gray scaling means converting each pixel from its original color representation (RGB) to a single intensity value ranging from black to white.

Here half of the image is gray scaled to show the difference between the colored image and the gray scaled image.

                                 Fig 3: Gray Scaled Image
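The conversion from RGB to a single intensity can be sketched as below. The weights are the ITU-R 601-2 luma weights that Pillow documents for its "L" mode conversion; the integer arithmetic here is an approximation. Which half of the image was gray scaled in the project is not stated, so the choice of the left half is an assumption, and the function names are hypothetical.

```python
def to_gray(pixel):
    """Convert one (R, G, B) pixel to a single 0-255 intensity."""
    r, g, b = pixel
    # ITU-R 601-2 luma weights; integer approximation.
    return (r * 299 + g * 587 + b * 114) // 1000


def gray_left_half(image):
    """Gray-scale the left half of an image given as rows of (R, G, B) tuples.

    Assumption: the left half is converted; the right half is kept in color.
    """
    half = len(image[0]) // 2
    return [
        [(to_gray(p),) * 3 if x < half else p for x, p in enumerate(row)]
        for row in image
    ]


# Made-up 1x2 image: a red pixel and a green pixel.
image = [[(255, 0, 0), (0, 255, 0)]]
print(gray_left_half(image))
```

Green carries the largest weight and blue the smallest, because the eye is most sensitive to green, so a pure green pixel maps to a brighter gray than a pure blue one.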

 

    4. Blur

This blurs the whole image or selected parts of it. The image shown below is fully blurred.

                                         Fig 4: Blurred Image

 

    5. Resizing the Image

The image can be resized by changing its size values. Here the width of the image is reduced to half and the height is doubled.

        Fig 5: Resized Image
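As an illustration of what resizing does, here is a minimal nearest-neighbour sketch that halves the width and doubles the height of a tiny made-up image. The project used Pillow, where `Image.resize((width, height))` performs this step; the resampling method used there is not stated, so nearest-neighbour is an assumption.

```python
def resize_nearest(image, new_width, new_height):
    """Resize an image (a list of pixel rows) using nearest-neighbour sampling."""
    old_height, old_width = len(image), len(image[0])
    return [
        [
            image[y * old_height // new_height][x * old_width // new_width]
            for x in range(new_width)
        ]
        for y in range(new_height)
    ]


# Made-up 2x4 image with one value per pixel.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]

# Halve the width (4 -> 2) and double the height (2 -> 4).
for row in resize_nearest(image, new_width=2, new_height=4):
    print(row)
```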

 

    6. Converting Gray scaled image to RGB image

Here the image shown below, which is already gray scaled, is converted to an image with primary colors. This gives a basic colored image with RGB colors as output.

The below picture is the gray scaled image to be converted.

Fig 6: Gray Scaled Image

The picture shown below is the converted gray scaled image after the primary color implementation.

Fig 7: Gray Scale Image to RGB Image

 

    7. Detecting Tumor cells using Image Processing

Image processing techniques are useful for finding tumors in an image.

They are an effective way to find small tumors in the body. Now we are going to find a tumor in the brain using image processing. The first step is to import the image of the scan.

Fig 9: Imported Image of Brain

 

    8. Heat map of the Image

The heatmap is created to differentiate the tumor from the normal parts of the brain. Here the tumor parts are in a light color and the other parts are in different colors. These are separated using the RGB values of the image.

Fig 10: Heatmap of the image

 

    9. Highlighting the index values of the tumor

The image below gives the index values of the pixels that contain tumor cells, obtained by selecting the colors that are close to white and grouping them together.

Fig 11: Indexing the tumor parts of brain

 

    10. Identifying the tumor area using Image Processing

The scan image is gray scaled and areas are selected to identify the normal parts of the brain and the tumors. The pixels are checked based on their RGB values, and the tumor areas are identified and marked with a color to differentiate them.

Fig 12: Identified Area of  tumor in brain

 

    11. Differentiating the tumor and normal brain parts

To get a clear view and to identify the tumor parts, the brain area and tumor area are differentiated with different colors. By this, we can find the percentage of area that has tumor.

Fig 13: differentiating tumor cells from normal brain parts
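The steps above (select near-white pixels, record their indices, and compute the share of the area they cover) can be sketched as follows. The threshold of 200, the tiny scan grid and the function name are assumptions for illustration only. The percentage here is taken over all pixels of the image, whereas the project may have measured it against the brain area only.

```python
def bright_regions(gray, threshold=200):
    """Return the (row, column) indices of bright pixels and their share of the image.

    `gray` is a list of rows of 0-255 intensities. Pixels at or above
    `threshold` are treated as candidate tumor pixels (assumed cut-off).
    """
    flagged = [
        (y, x)
        for y, row in enumerate(gray)
        for x, value in enumerate(row)
        if value >= threshold
    ]
    total = sum(len(row) for row in gray)
    return flagged, 100.0 * len(flagged) / total


# Made-up 4x4 gray-scale "scan" with a small bright patch.
scan = [
    [10, 12, 11, 9],
    [14, 230, 240, 13],
    [12, 235, 90, 10],
    [11, 10, 12, 9],
]

indices, percentage = bright_regions(scan)
print(indices)
print(f"Percentage of flagged area: {percentage:.2f}%")
```

A fixed brightness threshold is the simplest possible rule, and its value strongly affects the resulting percentage, so it would need to be tuned for each type of scan.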

 

12. Result

Here basic knowledge of image processing is gained and the tumor is found using image processing techniques. The size of the tumor can also be found using this technique.

Percentage of clot is: 0.495%

 

13. Applications of Image Processing

Image processing is widely used in

  • Face identification
  • Image to text conversion
  • Object detection
  • Handwriting detection, etc.

 

Please note: This project was done entirely in Python using Google Colab. A lot of work went into image analysis, image preparation and image processing in this project.