Data Science | Introduction to the Orange Tool
This blog post shows how the Orange tool can be used for various purposes in data science and walks through some of its functions. So let's get started.
What is Orange Tool?
Orange is an open-source data visualization, machine learning, and data mining toolkit. It features a visual programming front-end for explorative data analysis and interactive data visualization, and can also be used as a Python library.
- Widgets: The various components present in Orange are known as widgets and they are divided into various categories like Data, Visualize, Model, Evaluate, and so on.
- Workflows: Orange workflows consist of widgets that read, process, and visualize data. We place the widgets on a drawing board (the “canvas”). Widgets communicate by sending information along a communication channel: the output of one widget is used as the input to another.
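The idea that one widget's output becomes another widget's input can be sketched in plain Python. This is only an illustration of the dataflow concept: the function names below are hypothetical stand-ins for widgets and are not part of the Orange API, and the rows are a tiny hand-written sample.

```python
# Plain-Python sketch of the dataflow idea. The function names are
# hypothetical stand-ins for widgets, not part of the Orange API.

def file_widget():
    """Source widget: outputs a small table of rows (hard-coded sample)."""
    return [
        {"petal length": 1.4, "iris": "Iris-setosa"},
        {"petal length": 4.7, "iris": "Iris-versicolor"},
        {"petal length": 6.0, "iris": "Iris-virginica"},
        {"petal length": 1.3, "iris": "Iris-setosa"},
    ]


def select_rows_widget(rows, min_petal_length):
    """Processing widget: passes on only the rows that meet a condition."""
    return [row for row in rows if row["petal length"] >= min_petal_length]


def distribution_widget(rows):
    """Summarizing widget: counts rows per class, like a tiny histogram."""
    counts = {}
    for row in rows:
        counts[row["iris"]] = counts.get(row["iris"], 0) + 1
    return counts


# Each call's output is the next call's input, like a link on the canvas.
data = file_widget()
selected = select_rows_widget(data, min_petal_length=2.0)
all_counts = distribution_widget(data)
selected_counts = distribution_widget(selected)

print(all_counts)
print(selected_counts)
```

One source feeding two different consumers here mirrors how a single File widget can be linked to several widgets on the canvas at once.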
Overview:
Here is the blank canvas of Orange, where you will do all your data exploration. On the left-hand side there are 5 sections, each containing different widgets that we will use later for data exploration.
Creating Workflow
To get started with Orange, I have created a simple workflow that uses the inbuilt Iris dataset provided by Orange. In this workflow, data from the dataset is sent to a Data Table, to Distributions for plotting a distribution, and to a Scatter Plot.
To create this workflow, we load the dataset using the File widget and then create the links File-Data Info, File-Data Table, File-Distributions, and File-Scatter Plot. In this way, we can create a simple workflow in Orange.
To load the data into the canvas, select the File widget from the left pane and place it on the canvas. Double-click the File widget and select the iris.tab file.
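To give a feel for what a .tab file contains, here is a minimal sketch that parses a tab-delimited text with the Python standard library. It assumes the three-row header layout described in Orange's documentation (column names, then column types, then optional flags such as class); the sample text is a short hand-written excerpt in that style, not the full iris.tab file, and read_tab is a hypothetical helper, not an Orange function.

```python
import csv
import io

# Hand-written excerpt in the assumed three-row-header layout:
# row 1 = column names, row 2 = types (c = continuous, d = discrete),
# row 3 = optional flags (here "class" marks the target column).
SAMPLE_TAB = (
    "sepal length\tsepal width\tpetal length\tpetal width\tiris\n"
    "c\tc\tc\tc\td\n"
    "\t\t\t\tclass\n"
    "5.1\t3.5\t1.4\t0.2\tIris-setosa\n"
    "7.0\t3.2\t4.7\t1.4\tIris-versicolor\n"
    "6.3\t3.3\t6.0\t2.5\tIris-virginica\n"
)


def read_tab(text):
    """Hypothetical helper: split a tab-delimited text into header rows and data rows."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    names = next(reader)   # column names
    types = next(reader)   # column types
    flags = next(reader)   # optional flags, e.g. "class"
    rows = [row for row in reader if row]
    return names, types, flags, rows


names, types, flags, rows = read_tab(SAMPLE_TAB)

# The target variable is the column flagged as "class".
target = [name for name, flag in zip(names, flags) if flag == "class"]

print("columns:", names)
print("target:", target)
print("data rows:", len(rows))
```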
Workflow:
The Orange tool offers many widgets; connecting them to each other in the proper order is how we build a workflow. We can also use the inbuilt example workflows that come with the tool for our own tasks.
Check out the list of workflows that come with the tool here.
Classification Tree workflow
In the bottom left corner there is a Workflow Examples option, which offers many example workflows.
Here I load the Classification Tree workflow.
After loading the workflow, you can see that many widgets are connected with each other. We can also modify the widgets as per our needs.
The Classification Tree workflow is used to explore the classification of data using decision tree methods. Let’s see the classification tree for the Iris dataset.
Clicking on Tree Viewer shows a decision tree like this for your dataset.
Finally, at the end we have Scatter Plot and Box Plot widgets connected to the Tree Viewer. First select data in the Tree Viewer; you can then visualize the selected data in the Scatter Plot and Box Plot widgets.
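To make the decision tree idea more concrete, here is a rough sketch of how a tree could choose a single split on one numeric column. It uses Gini impurity, which is one common splitting criterion; Orange's Tree widget may use a different criterion and different settings, so this is an illustration of the technique and not a reproduction of what the Tree Viewer shows. The petal lengths are a small hand-picked sample, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def best_threshold(values, labels):
    """Try midpoints between sorted values; return (threshold, weighted impurity)."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no split point between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Small hand-picked sample of petal lengths (not the full Iris dataset).
petal_length = [1.4, 1.3, 1.5, 4.7, 4.5, 6.0, 5.1]
species = [
    "Iris-setosa", "Iris-setosa", "Iris-setosa",
    "Iris-versicolor", "Iris-versicolor",
    "Iris-virginica", "Iris-virginica",
]

threshold, score = best_threshold(petal_length, species)
print("split at petal length <=", threshold, "with weighted Gini", round(score, 4))
```

On this sample the chosen split separates all the setosa rows from the rest, which is why the left branch ends up pure. A full tree would repeat the same search inside each branch.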
Data Exploration
We can visualize a dataset very easily and get insights from the data using the Orange tool.
I used the workflow below (created by me) to explain.
ML Model Workflow
For the exploration part I used the following widgets:
(1) Data Table: For viewing the dataset in tabular form. Here the iris column is shaded darker, i.e. it is the target variable.
Data Table Widget
(2) Distributions: Used to get information about the distribution of the data.
(3) Scatter Plot: Used to visualize the data as a scatter plot.
(4) Bar Plot: Represents the data as bars. It is a very simple and basic plot.
Bar chart Widget
(5) Linear Projection: This widget projects the data onto a two-dimensional plane as a linear combination of the selected variables, so more than two variables can be viewed in a single plot.
(6) Violin Plot: A violin plot is a method of plotting numeric data. It is similar to a box plot, with the addition of a rotated kernel density plot on each side.
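The projection idea behind item (5) can be sketched in a few lines of plain Python. This is a simplified illustration and not the widget's actual implementation: it assumes each variable gets a 2D anchor placed evenly around a circle, rescales every column to the range 0 to 1, and positions each row at the weighted sum of the anchors. All function names are hypothetical, and the three rows are a hand-written sample.

```python
import math


def circular_anchors(n_vars):
    """Place one 2D anchor per variable evenly around the unit circle."""
    return [
        (math.cos(2 * math.pi * i / n_vars), math.sin(2 * math.pi * i / n_vars))
        for i in range(n_vars)
    ]


def normalize_columns(rows):
    """Rescale each column to the range [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    spans = [(max(column) - min(column)) or 1.0 for column in columns]
    return [
        [(value - low) / span for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]


def project(rows, anchors):
    """Map each row to 2D as the sum of its values times the variable anchors."""
    points = []
    for row in rows:
        x = sum(value * ax for value, (ax, _) in zip(row, anchors))
        y = sum(value * ay for value, (_, ay) in zip(row, anchors))
        points.append((x, y))
    return points


# Hand-written sample rows: sepal length, sepal width, petal length, petal width.
rows = [
    [5.1, 3.5, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.3, 3.3, 6.0, 2.5],
]

anchors = circular_anchors(4)
points = project(normalize_columns(rows), anchors)
for point in points:
    print(point)
```

Because the mapping is a plain weighted sum, a row that is zero in every variable lands at the origin, and a row that is 1 in a single variable lands exactly on that variable's anchor.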
How to load your data in Orange and how to load external data from an API in Orange?
To load your data in Orange, select the File widget. From there you can either select a dataset provided by Orange or browse to a dataset file on your local machine. If you want to load external data, you can select the URL option in the File widget and paste the link to the external dataset.
This blog covered the basics of the Orange tool. To learn more about the Orange tool, see the next blogs.