Got a Tape Backup Bob?

Software Is The Language Of Automation

Jensen Huang

First, i trust everyone is safe. Second: Hey Now! Wednesday is already here again! Why did Willy Wonka say about “So Much Time And So Little To Do?” Or better yet “Time Is Fun When You Are Having Flies!” Snake_Byte[8] Time!

This is a serendipitous one because i stumbled onto a library that uses a library that i mentioned in my last Snake_Bytes which was pandas. It’s called MitoSheets and it auto-generates code for your data wrangling needs and also allows you to configure and graph within your Jupyter_Lab_Notebooks. i was skeptical.

So we will start at the beginning which is where most things start:

i am making the assumption you are either using a venv or conda etc. i use a venv so here are the installation steps:

pip install mitoinstaller
pip mitoinstaller install

Note the two step process you need both to instantiate the entire library.

Next crank up ye ole Jupyter Lab:

import mitosheets
mito.sheet()

It throws up a wonky splash screen to grab your digits and email to push you information on the Pro_version i imagine.

Then you can select a file. i went with the nba.csv file from the last blog Snake_Bytes[7] Pandas Not The Animal. Find it here :

Then low and behold it spit out the following code:

from mitosheet import *; register_analysis("id-ydobpddcec");
    
# Imported nba.csv
import pandas as pd
nba = pd.read_csv(r'nba.csv')

register_analysis("id-ydobpddcec") is locked to the respective file.

So how easy is it to graph? Well, it was trivial. Select graph then X & Y axis:

Team Members vs Team Graph
Graph Configuration

So naturally i wanted to change the graph to purple and add some grid lines with a legend to test the export and here was the result:

They gotcha!

As Henry Ford said you can have any color car as long as it is black. In this case you are stock with the above graph while useful it’s not going to catch anyone’s eye.

Then i tried to create a pivot table and it spit out the following code:

from mitosheet import *; register_analysis("id-ydobpddcec");
    
# Imported nba.csv
import pandas as pd
nba = pd.read_csv(r'nba.csv')

# Pivoted into nba
tmp_df = nba[['Team', 'Position', 'Number']]
pivot_table = tmp_df.pivot_table(
    index=['Team'],
    columns=['Number'],
    values=['Position'],
    aggfunc={'Position': ['count']}
)
pivot_table.set_axis([flatten_column_header(col) for col in pivot_table.keys()], axis=1, inplace=True)
nba_pivot = pivot_table.reset_index()

Note the judicious use of our friend the pandas library.

Changing the datatype is easy:

from salary to datatime_ascending
from mitosheet import *; register_analysis("id-ydobpddcec");
    
# Imported nba.csv
import pandas as pd
nba = pd.read_csv(r'nba.csv')

# Changed Salary to dtype datetime
import pandas as pd
nba['Salary'] = pd.to_datetime(nba['Salary'], unit='s', errors='coerce')

It also lets you clear the current analysis:

Modal Dialog

So i started experimenting with the filtering:

Player Weight < 180.0 lbs
from mitosheet import *; register_analysis("id-ydobpddcec");
    
# Imported nba.csv
import pandas as pd
nba = pd.read_csv(r'nba.csv')

# Filtered Weight
nba = nba[nba['Weight'] < 180]

The views for modification are on the right side of the layout of the table which is very convenient. The automatic statistics and visualizations are helpful as well:

Unique Ascending Values
Weight Frequencies < 180.0 lbs

The max,min,median, and std are very useful and thoughtful:

Rule Based Summary Statistics

The following in and of itself could be enough to pip install the library:

DataFrame Gymnastics

You can even have multiple dataframes that can be merged. Between those items and the summary stats for those that are experienced this could be enough price to entry to pip install and then install the library. For those that really don’t know how to code this allows you to copypasta code and learn some pretty basic yet very powerful immediate insights into data. Also if you are a business analyst, a developer could get you going in no time with this library.

i don’t particularly like the lockouts on the paywall for features. In today’s age of open-source humans will get around that issue and just use something else, especially the experienced folks. However, what caught my attention was the formatting and immediate results with a code base that is useful elsewhere, so i think the Mito developer team is headed in the right direction. i really can see this library evolving and adding sklearn and who knows Github Copilot. Good on them.

Give it a test drive.

Until Then,

#iwishyouwater <- #OuterKnown Tahiti Pro 2022 – Best Waves

@tctjr

Muzak To Blog By: Tracks from “Joe’s Garage” by Frank Zappa. “A Little Green Rosetta” is hilarious as well as a testament to Zappa’s ability to put together truly astound musicians. i love the central scrutinizer and “Watermelon in Easter Hey” i believe is one of the best guitar pieces of all time. Even Zappa said it was one of his best pieces and to this day Dweezil Zappa is the only person allowed to play it. One of my readers when i reviewed the Zappa documentary called the piece “intoxicating”. Another exciting aspect of this album is that he used live guitar solos and dubbed them into the studio work except for Watermelon In Easter Hey. The other Muzak was by a band that put Atlanta on the map: Outkast. SpeakerBoxx is phenomenal and Andre3000 is an amazing musician. “Prototype” and “Pink & Blue”. Wew.

Leave a Reply

Your email address will not be published. Required fields are marked *