Snake_Byte[10] – Module Packages

Complexity control is the central problem of writing software in the real world.

Eric S. Raymond
AI-Generated Software Architecture Diagram

Hello dear readers! First, i hope everyone is safe. Secondly, it is the monday-iest WEDNESDAY ever! Ergo it’s time for a Snake_Byte!

Grabbing a tome off the bookshelf we randomly open it and the subject matter today is Module Packages. So there will not be much if any code but more discussion, as it were, on the explanations thereof.

Module imports are the mainstay of the snake language.

A Python module is a file that has a .py extension, and a Python package is any folder that has modules inside it (or if you’re still in Python 2, a folder that contains an __init__.py file).

What happens when you have code in one module that needs to access code in another module or package? You import it!

In Python a directory of modules is said to be a package, thus imports that name a directory path are known as package imports. What happens in such an import is that the dotted path in the code is mapped to a directory, whether that directory is local (your come-pooter) or on that cloud thing everyone talks about these days.

It turns out that this hierarchy makes large collections of files easier to organize and tends to simplify the module search path settings.

Absolute imports are preferred because they are direct. It is easy to tell exactly where the imported resource is located and what it is just by looking at the statement. Additionally, absolute imports remain valid even if the current location of the import statement changes. In addition, PEP 8 explicitly recommends absolute imports. However, sometimes they get so complicated you want to use relative imports.

So how do imports work?

import dir1.dir2.mod
from dir1.dir2.mod import x

Note the “dotted path” in these statements is assumed to correspond to a path through the directory hierarchy on the machine you are developing on. In this case directory dir1 contains subdirectory dir2, which contains the module mod.py. Historically the dotted path syntax was chosen for platform neutrality, and from a technical standpoint paths in import statements become object paths.
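To make the mapping from dotted path to directories concrete, here is a minimal sketch. It builds the dir1/dir2/mod.py tree in a temporary directory and puts the parent of dir1 on sys.path. The temp directory and the value stored in x are assumptions made for illustration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway tree: <root>/dir1/dir2/mod.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "dir1" / "dir2"
pkg_dir.mkdir(parents=True)

# __init__.py files are optional in Python 3.3+, but they make these
# directories regular packages (and Python 2 requires them).
(root / "dir1" / "__init__.py").write_text("")
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "mod.py").write_text("x = 42\n")

# The directory that contains dir1 has to be on the module search path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import dir1.dir2.mod           # binds dir1; mod is reached as an attribute path
from dir1.dir2.mod import x    # binds only the name x

print(dir1.dir2.mod.__file__)  # a file path ending in dir1/dir2/mod.py
print(x)
```

Note how the object path dir1.dir2.mod mirrors the directory path dir1/dir2/mod.py.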

In general the leftmost name in the dotted path (dir1 here) must be found in a directory on the module search path, sys.path, unless it lives in the home directory of the top-level file, which is searched automatically.

In Python 3.x package imports changed slightly. The change only applies to imports within files located in package directories. The changes include:

  • The module import search path semantics were modified to skip the package’s own directory by default. These imports are essentially absolute imports.
  • The syntax of from statements was extended to allow them to explicitly request that imports search the package’s directory only. This is the relative import mentioned above.

so for instance:

from . import spam  # relative to this package

Instructs Python to import a module named spam located in the same package directory as the file in which this statement appears.

Similarly:

from .spam import name

states: from a module named spam located in the same package as the file that contains this statement, import the variable name.

Something to remember is that an import without a leading dot always causes Python to skip the relative components of the module import search path and look instead in the absolute directories that sys.path contains. The leading-dot syntax is only available in the from statement; a plain import statement is always absolute.
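A small sketch of the two relative forms in action. The package name snakebyte_pkg and the module eggs.py are hypothetical, invented here so there is a file inside a package to put the from statements in.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout, built in a temp directory:
#   snakebyte_pkg/__init__.py
#   snakebyte_pkg/spam.py   defines name
#   snakebyte_pkg/eggs.py   reaches spam with relative imports
root = Path(tempfile.mkdtemp())
pkg = root / "snakebyte_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "spam.py").write_text("name = 'spam from the package'\n")
(pkg / "eggs.py").write_text(
    "from . import spam      # sibling module in this package\n"
    "from .spam import name  # one variable from that sibling\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# The top-level script itself uses an absolute import.
from snakebyte_pkg import eggs

print(eggs.spam.__name__)  # snakebyte_pkg.spam
print(eggs.name)           # spam from the package
```

The relative forms only work inside a file that is itself imported as part of a package; running eggs.py directly as a script would raise an ImportError.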

Packages are standard now in Python 3.x. It is now very common to see very large third-party extensions deployed as a set of package directories rather than as a flat list of modules. Also, caveat emptor: an import loads the entire module into memory however you spell it, absolute or relative, so be selective about what you import. Read the documentation. Many times importing AllTheThings results in major memory usage, an issue when you are going to production with highly optimized python.

There is much more to this import stuff: Transitive Module Reloads, Managing other programs with Modules (meta-programming), Data Hiding etc. i urge you to go into the LazyWebTM and poke around.

In addition, a very timely post:

PyPI is running a survey on packages:

Take the survey here -> PyPI Survey on Packages

Here are some great comments and suggestions via Y-Combinator News:

Y-Combinator News Commentary on PyPI Packages

That is all for now. i think next time we are going to delve into some more scientific or mathematical snake language bytes.

Until Then,

#iwishyouwater <- Wedge top 50 wipeouts. Smoookifications!

@tctjr

MUZAK TO BLOG BY: NIN – “The Downward Spiral (Deluxe Edition)”. A truly phenomenal piece of work. NIN’s second album; trent reznor told jimmy iovine upon delivering the concept album “I’m Sorry I had to…”. In 1992, Reznor moved to 10050 Cielo Drive in Benedict Canyon, Los Angeles, where actress Sharon Tate formerly lived and where he made the record. i believe it changed the entire concept of music and created a new genre. From an engineering point of view, Digidesign’s TurboSynth and Pro Tools were used extensively.

It Is An Honor To Say “GoodBye”.

No one ever told me that grief felt so like fear.

C.S. Lewis
An AI-Generated Image

First, i hope everyone is safe, especially on this day when belief systems ran completely amok. Second, this day also holds a place for me that i will not go into but if you are a good internet sleuth you can figure it out.

Today i did something i have never done nor did i think i could do because of several factors. However, into the breach once more and lo and behold i pulled it off. The man with me is an expert at this activity and gave me some pointers as to how to perform the said activity. As i was saying goodbye to the man, who is one of the closest people in my life, we volitionally hugged each other and shook hands a certain way.

On this day i reflected on an Uber ride that i had years ago where a man picked me up. We started talking as it was a pretty good drive from SFO to the Marines’ Memorial Club & Hotel where i was speaking.

There are places I’ll remember
All my life though some have changed
Some forever, not for better
Some have gone and some remain

All these places have their moments
With lovers and friends I still can recall
Some are dead and some are living
In my life I’ve loved them all

The driver as it turns out was a former senior salesperson at Salesforce. As i always say you never know what someone has been through so don’t judge them by how they make a living. We discussed most of the “-isms” and then he said, “Mr Ted i found comfort in the christian bible. Have you read it?” i said i have read it three times and i prefer the old testament. i asked him why. He said it helped him through the hard times of his life. He was talking about his family in past tense and i was very sensitive to prying too much into his business. i asked him what type of hardships. He said his family lived during the years of Pol Pot and the Cambodian genocide and his family were all murdered. i really didn’t know what to say except “My Condolences”. He said, “Thank you Mr Ted. i have found peace and remember it is an honor to say goodbye to someone and to always make it count as you never know when you will see them again. As a matter of fact i do not tell people Goodbye i say i love you or be safe.”

We arrived at the Marine Hotel. We got out of the car and he said, “Mr Ted it has been an honor speaking with you i hope you enjoy your life. Be Safe Mr Ted.”

That left an indelible imprint on my mind.

Though I know I’ll never lose affection
For people and things that went before
I know I’ll often stop and think about them
In my life I love you more

On 9.11 – Today many lost loved ones. Grief, as Mr Lewis states, is very much like fear except you cannot Un-Grieve. You can be unafraid. Grief, as it turns out, is never-ending. There is no invertible transformation that makes you not grieve.

We have been so programmed to buck it up – suck it up, buttercup that everything tries to gloss over the loss. Whether a human or a family pet, it is ok to grieve. There are people and animals in my life that i will never recover from losing and for the longest time i beat myself up for not bucking up buttercup.

Further contemplating this i believe Grief is fractal. Zoom in on a fractal and it evolves and changes yet holds the same shape ad infinitum[1].

Mandelbrot Set Generated Fractal

Grief as it turns out appears at least to me to be closely aligned. The more you peel it back the more complex it gets.

Same Fractal Zoomed

Does time heal Grief? Not really. It is the memory that fades. Ergo other memories fade as a function of our leaky memory system.

We deal with healing in different ways. The Uber driver found solace in a religious text, others work out, some self-medicate, others try to replace the human or animal.

We want it to go away.

i say we should acknowledge the pain of grief, realize it, and let it happen. Then, with the next person or animal who is essential to you, use the opportunity and find strength in telling them “Be Safe, See ya Real Soon, or i love you more.” However above all, if you cherish that friend or loved one, it is an honor to tell them so as they walk out the door. Let them know it.

Until Then,

#iwishyouwater. <- Laird Hamilton on a Paddle board

@tctjr

Muzak To Blog By: A band called Papir.

[1] The Mandelbrot set is the set of complex numbers c for which the function $$f_{c}(z)=z^2+c$$ does not diverge to infinity when iterated from $$z=0$$.
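For the curious, the footnote’s definition can be sketched as a membership test in a few lines of Python. The function name in_mandelbrot and the iteration cap of 100 are assumptions for illustration; a finite cap only approximates “does not diverge”.

```python
def in_mandelbrot(c: complex, max_iter: int = 100) -> bool:
    """Return True if c has not escaped after max_iter iterations of z -> z**2 + c."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        # Once |z| exceeds 2 the orbit is guaranteed to run off to infinity.
        if abs(z) > 2.0:
            return False
    return True


print(in_mandelbrot(0j))       # True
print(in_mandelbrot(-1 + 0j))  # True: the orbit cycles 0 -> -1 -> 0 -> -1 ...
print(in_mandelbrot(1 + 0j))   # False: the orbit runs 0 -> 1 -> 2 -> 5 ...
```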

Snake_Byte[9] XKCD PLOTS

An algorithm must be seen to be believed.

~ D. Knuth

First i trust everyone is safe. Second it’s WEDNESDAY so we got us a Snake_Byte! Today i wanted to keep this simple, fun and return to a set of fun methods that are included in the de facto standard for plotting in python, which is Matplotlib. The method is called XKCD style plotting via plt.xkcd().

If you don’t know what that is referencing, it is xkcd, sometimes styled XKCD, which is a webcomic created in 2005 by American author Randall Munroe. The comic’s tagline describes it as “a webcomic of romance, sarcasm, math, and language”. Munroe states on the comic’s website that the name of the comic is not an initialism but “just a word with no phonetic pronunciation”. i personally have read it since its inception in 2005. The creativity is astounding.

Which brings us to the current Snake_Byte. If you want to have some fun and creativity in your marchitecture[1] and spend fewer hours on those power points bust out some plt.xkcd() style plots!

First thing is you need to install matplotlib:

pip install matplotlib

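In this simple example we also need numpy:

pip install numpy

import matplotlib.pyplot as plt
import numpy as np

plt.xkcd()  # turn on the xkcd sketch style for everything plotted after this
plt.plot(np.sin(np.linspace(0, 10)))
plt.plot(np.sin(np.linspace(10, 20)))
plt.title('Sorta Lissajous')
plt.show()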
Sorta Lissajous

So really that is all there is to it, and you still get all the bells and whistles that matplotlib has to offer.

The following script was based on Randall Munroe’s Stove Ownership.

(Some will get the inside industry joke.)

with plt.xkcd():
    # Based on "Stove Ownership" from XKCD by Randall Munroe
    # https://xkcd.com/418/

    fig = plt.figure()
    ax = fig.add_axes((0.1, 0.2, 0.8, 0.7))
    ax.spines.right.set_color('none')
    ax.spines.top.set_color('none')
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_ylim([-30, 10])

    data = np.ones(100)
    data[70:] -= np.arange(30)

    ax.annotate(
        'THE DAY I TRIED TO CREATE \nAN INTEROPERABLE SOLUTION\nIN HEALTH IT',
        xy=(70, 1), arrowprops=dict(arrowstyle='->'), xytext=(15, -10))

    ax.plot(data)

    ax.set_xlabel('time')
    ax.set_ylabel('MY OVERALL MENTAL SANITY')
    fig.text(
        0.5, 0.05,
        '"Stove Ownership" from xkcd by Randall Munroe',
        ha='center')
Interoperability In Health IT

So dear readers there it is, an oldie but goodie, and it is so flexible! Add it to your slideware or marchitecture or just add it because it’s cool.

Until Then,

#iwishyouwater <- Mentawis surfing paradise. At least someone is living.

Muzak To Blog By: NULL

[1] Marchitecture is a portmanteau of the words marketing and architecture. The term is applied to any form of electronic architecture perceived to have been produced purely for marketing reasons and has in many companies replaced actual software creation.

What Are You Good At?

Panda Says…

The thing that you are most comfortable with that you do the best.

Steve Vai

First as always i hope everyone is safe. Second this blog is out of band so to speak. This question was posited to me during a technical discussion with some great folks whom i had just met and we were discussing re-tooling and scaling enterprise systems.

Almost as a complete non sequitur, this executive asked:

Q: What do you think you are good at?

There aren’t too many times when i am personally caught off guard but i stopped and replied “Well that is a great question. So good in fact i am going to write a blog if you don’t mind.”

Which then got me to thinking:

Q: If you really truly love what you are doing is it really work?

Not at all.

This goes along with several of the blogs i have done in the past concerning – What Is IT You Truly Want?

So without any hubris or narcissism as far as i know or have been told what i am good at is the following:

Evidently, i have an uncanny ability to see what needs to be built with the right team and at a pretty good time or within a certain timeframe.

The other thing i am supposedly good at is getting people aligned and excited around a common vision to execute said code base or system(s).

i also take the title CTO very seriously. i do pride myself on keeping up with technology. i attempt to find out what works and what does not work. Mathematics and Software completely and unequivocally enthrall me. i am always either reading a book, paper or blog on a technical subject. i don’t really keep up with the normal outside world. So i can’t really comment on sports, movies, or the daily news. Ergo i don’t go around chasing fads however many times you do have to create net new warez. i tend to go deep on technical subjects before i bring them into an organization.

Really after thinking about all of this is it useful? After thinking more about it i really cannot say at this time. Maybe it’s an occupational hazard nowadays. However, it is what i maybe think i am good at so to speak.

However, i can assure you that holding that mirror up to yourself and looking deep into it is an exercise we should all do on a daily basis. As the famous song lyric goes “Chickity-check yo’ self before you wreck yo’ self”.

So finding out what you are really good at and strengthening that creates a self-perpetuating system. It has been said You Are Your Best Charity. If you truly enjoy what you are doing then it really isn’t work is it? This allows you to concentrate on what you are good at and then in turn Amplifying_OthersTM.

Find Your Passion At All Costs.

Become the system You are creating.

Then IT will naturally happen.

I posted this video of Steve Vai a long time ago in another blog and in another life. i ran across it while taking a break at 2:30 AM EST working on a very serious bug with a company i co-founded. It paused me. i always come back to it. While this is supposedly a master class with Steve Vai he never talks about guitar technique but rather how to be successful (really at anything).

Possibly the only thing that i think is more amazing than creating software is music creation. Here is Mr. Steve Vai doing what he does best. Do yourself a favor, watch and listen. i’ll hopefully be seeing him soon in Charleston, SC.

Vai Virtuosity

Would love some comments and feedback on this blog. While it is short i have to tell you it was difficult to type those couple of sentences.

Until Then,

#iwishyouwater <- Will Trubridge 60M Freedive in 60 seconds

@tctjr

Muzak To Blog By: Steve Vai.

Snake_Bytes[8] Intro_To_Mito

Got a Tape Backup Bob?

Software Is The Language Of Automation

Jensen Huang

First, i trust everyone is safe. Second: Hey Now! Wednesday is already here again! What did Willy Wonka say about “So Much Time And So Little To Do?” Or better yet “Time Is Fun When You Are Having Flies!” Snake_Byte[8] Time!

This is a serendipitous one because i stumbled onto a library that uses a library that i mentioned in my last Snake_Bytes which was pandas. It’s called MitoSheets and it auto-generates code for your data wrangling needs and also allows you to configure and graph within your Jupyter_Lab_Notebooks. i was skeptical.

So we will start at the beginning which is where most things start:

i am making the assumption you are either using a venv or conda etc. i use a venv so here are the installation steps:

pip install mitoinstaller
python -m mitoinstaller install

Note the two-step process: you need both commands to install the entire library.

Next crank up ye ole Jupyter Lab:

import mitosheet
mitosheet.sheet()

It throws up a wonky splash screen to grab your digits and email to push you information on the Pro_version i imagine.

Then you can select a file. i went with the nba.csv file from the last blog Snake_Bytes[7] Pandas Not The Animal. Find it here :

Then lo and behold it spit out the following code:

from mitosheet import *; register_analysis("id-ydobpddcec");
    
# Imported nba.csv
import pandas as pd
nba = pd.read_csv(r'nba.csv')

register_analysis("id-ydobpddcec") is locked to the respective file.

So how easy is it to graph? Well, it was trivial. Select graph then X & Y axis:

Team Members vs Team Graph
Graph Configuration

So naturally i wanted to change the graph to purple and add some grid lines with a legend to test the export and here was the result:

They gotcha!

As Henry Ford said, you can have any color car as long as it is black. In this case you are stuck with the above graph; while useful, it’s not going to catch anyone’s eye.

Then i tried to create a pivot table and it spit out the following code:

from mitosheet import *; register_analysis("id-ydobpddcec");
    
# Imported nba.csv
import pandas as pd
nba = pd.read_csv(r'nba.csv')

# Pivoted into nba
tmp_df = nba[['Team', 'Position', 'Number']]
pivot_table = tmp_df.pivot_table(
    index=['Team'],
    columns=['Number'],
    values=['Position'],
    aggfunc={'Position': ['count']}
)
pivot_table.set_axis([flatten_column_header(col) for col in pivot_table.keys()], axis=1, inplace=True)
nba_pivot = pivot_table.reset_index()

Note the judicious use of our friend the pandas library.

Changing the datatype is easy:

from salary to datetime_ascending
from mitosheet import *; register_analysis("id-ydobpddcec");
    
# Imported nba.csv
import pandas as pd
nba = pd.read_csv(r'nba.csv')

# Changed Salary to dtype datetime
import pandas as pd
nba['Salary'] = pd.to_datetime(nba['Salary'], unit='s', errors='coerce')

It also lets you clear the current analysis:

Modal Dialog

So i started experimenting with the filtering:

Player Weight < 180.0 lbs
from mitosheet import *; register_analysis("id-ydobpddcec");
    
# Imported nba.csv
import pandas as pd
nba = pd.read_csv(r'nba.csv')

# Filtered Weight
nba = nba[nba['Weight'] < 180]

The views for modification are on the right side of the layout of the table which is very convenient. The automatic statistics and visualizations are helpful as well:

Unique Ascending Values
Weight Frequencies < 180.0 lbs

The max, min, median, and std are very useful and thoughtful:

Rule Based Summary Statistics
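If you want the same summary numbers outside the Mito UI, plain pandas gets you there. This is a sketch with made-up rows, not the real nba.csv; only the Weight column and the 180 lbs filter come from the generated code above, and the Name column is an assumption.

```python
import pandas as pd

# Made-up rows for illustration; Weight mirrors the column filtered above.
nba = pd.DataFrame(
    {
        "Name": ["A", "B", "C", "D"],
        "Weight": [175.0, 179.0, 185.0, 240.0],
    }
)

# Same filter Mito generated: keep players lighter than 180 lbs.
light = nba[nba["Weight"] < 180]

# max, min, median, and std in one call.
stats = light["Weight"].agg(["max", "min", "median", "std"])
print(stats)
```

For a wider set of statistics in one shot, light["Weight"].describe() is the usual pandas shortcut.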

The following in and of itself could be enough to pip install the library:

DataFrame Gymnastics

You can even have multiple dataframes that can be merged. Between those items and the summary stats, for those that are experienced this could be enough of a price of entry to pip install the installer and then install the library. For those that really don’t know how to code this allows you to copypasta code and learn some pretty basic yet very powerful immediate insights into data. Also if you are a business analyst, a developer could get you going in no time with this library.
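Merging dataframes looks like this in plain pandas. The two frames, their column names, and their values are invented for illustration; this is a sketch of a merge, not the code Mito itself emits.

```python
import pandas as pd

# Two made-up frames that share a Team column.
players = pd.DataFrame(
    {"Name": ["A", "B", "C"], "Team": ["Hawks", "Hawks", "Bulls"]}
)
teams = pd.DataFrame(
    {"Team": ["Hawks", "Bulls"], "City": ["Atlanta", "Chicago"]}
)

# Left merge: keep every player row and attach the matching team columns.
merged = players.merge(teams, on="Team", how="left")
print(merged)
```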

i don’t particularly like the lockouts on the paywall for features. In today’s age of open-source humans will get around that issue and just use something else, especially the experienced folks. However, what caught my attention was the formatting and immediate results with a code base that is useful elsewhere, so i think the Mito developer team is headed in the right direction. i really can see this library evolving and adding sklearn and who knows Github Copilot. Good on them.

Give it a test drive.

Until Then,

#iwishyouwater <- #OuterKnown Tahiti Pro 2022 – Best Waves

@tctjr

Muzak To Blog By: Tracks from “Joe’s Garage” by Frank Zappa. “A Little Green Rosetta” is hilarious as well as a testament to Zappa’s ability to put together truly astounding musicians. i love the central scrutinizer and “Watermelon in Easter Hay” i believe is one of the best guitar pieces of all time. Even Zappa said it was one of his best pieces and to this day Dweezil Zappa is the only person allowed to play it. One of my readers when i reviewed the Zappa documentary called the piece “intoxicating”. Another exciting aspect of this album is that he used live guitar solos and dubbed them into the studio work except for Watermelon In Easter Hay. The other Muzak was by a band that put Atlanta on the map: Outkast. Speakerboxxx is phenomenal and Andre3000 is an amazing musician. “Prototype” and “Pink & Blue”. Wew.

What Is Your KulChure?

Got It?

We are organized like a startup. We are the biggest startup on the planet.

S. Jobs

First, i hope everyone is safe. Second, this blog is about something everyone seems to be asking me about and talking about, but no one seems to be able to execute the concept much like interoperability in #HealthIT. Third, it is long-form content so in most cases tl;dr.

CULTURE.

Let us look to Merriam-Webster’s Online Dictionary for a definition – shall we?

cul·​ture <ˈkəl-chər>

1

a: the customary beliefs, social forms, and material traits of a racial, religious, or social group; also: the characteristic features of everyday existence (such as diversions or a way of life) shared by people in a place or time (popular culture, Southern culture)

b: the set of shared attitudes, values, goals, and practices that characterizes an institution or organization (a corporate culture focused on the bottom line)

c: the set of values, conventions, or social practices associated with a particular field, activity, or societal characteristic (studying the effect of computers on print culture)

d: the integrated pattern of human knowledge, belief, and behavior that depends upon the capacity for learning and transmitting knowledge to succeeding generations

2

a: enlightenment and excellence of taste acquired by intellectual and aesthetic training

b: acquaintance with and taste in fine arts, humanities, and broad aspects of science as distinguished from vocational and technical skills; a person of culture

3: the act or process of cultivating living material (such as bacteria or viruses) in prepared nutrient media; also: a product of such cultivation

4: CULTIVATION, TILLAGE

5: the act of developing the intellectual and moral faculties especially by education

6: expert care and training; beauty culture

Wow, This sounds complicated. Which one to leave in and which one to leave out?

Add to this complexity the fact that creating and executing production software is almost an insurmountable task. i have said for years software creation is one of the most significant human endeavors of all time. Related to these concerns, i also believe there is an interplay between comfort and solutions. Most if not all humans desire solutions; however, as far as i can tell solutions are never comfortable. Solutions involve change, and most humans are homeostatic. Juxtapose this against the fact that humans love comfort. So what do you do?

So why does it seem like everyone is talking about kəl-chər? i consider this to be like Fight Club. 1st rule of kəl-chər is you don’t talk about culture. It should be an implicit aspect of your organization. Build or Re-Build it as a first principles engineering practice. Perform root cause analysis of the behaviors within the company. If it does in fact need to be re-built start with you and your leadership. Turn the mirror on you first. Understand that you must lead by example. Merit Not Inherit.

i’ve recently been asked how you change and align culture. Well here are my recommendations and it comes down to TRUST at ALL levels of the company.

Create an I3 Lab: Innovation, Incubation, Intrapreneurship:

Innovation without code is just ideas and everyone has them. Ideas are cheap. Incubation without product market fit is a dead code base. Intrapreneurship is the spirit of a system that encourages employees to think and act like individual entrepreneurs and empowers them to take action, embrace risk, and make decisions as if they had founded the company themselves. Innovate – create the idea – Incubate – create the Maximum Viable Product (not minimum) – Intrapreneurship – spin out the Maximum Viable Product. As an aside Minimum Viable Product sounds like you bailed out making the best you possibly could in the moment. Take that Maximum Viable product and roll it into a business vertical and go to market strategy – then spin the wheel again.

I think it’s very important to have a feedback loop, where you’re constantly thinking about what you’ve done and how you could be doing it better.

E. Musk

Value The Most Important Asset – Your People

Managing high-performance humans is a difficult task because most high-performance humans do not like to be managed; they love to be led. Lead them by example. Value them and compensate them accordingly. Knowledge workers love achievement and goals. Lead them into the impossible, gravitate toward dizzying heights, and be there for them. Be completely transparent and communicate. Software is always broken. If anyone states differently they are not telling the truth. There is always refactoring, retargeting, more code coverage and nascent bugs. Let them realize you realize that; however, let them know that if they do make a mistake they must escalate immediately. Under no circumstances can you tolerate surprises. Give them the framework with OKRs and KPIs that lets them communicate openly, efficiently, effectively and, most important, transparently. Great teams will turn pencils into Montblanc Fountain Pens. Let them do what they do best and reward them!

Process Doesn’t Make A Culture

Nor does it make great products. Many focus on some software process. Apple used and as far as i know still uses strict waterfall. As far as i am concerned, we are now trending towards a Holacracy type of environment which is a self-organizing environment. However, this only can be achieved with the proper folks that appreciate the friction of creating great products from the best ideas. The Process of evolving from an idea to a product is magic. You learn you evolve; you grow your passion for and into the product as it becomes itself the team that built the product. Your idea and passion are inherent in that shipping software (or hardware).

What do you want me to do

To do for you to see you through?

The Grateful Dead

Empower Your People

Provide your people the ability to manage themselves and have autonomy. Set them free. Trust them to make the decisions that will drive the company and projects into world-class endeavors. Take a chance with them. Let a new college graduate push some code to production. Let a new sales associate push a deal with a customer. Let your new marketing person design an area on the company site. Allow them to evolve, grow, and be a part of the Great Endeavor. Put them in charge and provide the framework for autonomy to make decisions and when they deliver – reward them not with something ephemeral but volitional. Money and Stock work wonders. Empower. Align. Evolve.

Provide and Articulate a Common Vision

Provide a vision of the company or project. Two sentences that everyone understands. Most people who are empowered and given the frameworks to create within know what to do in these circumstances. Articulate the vision and gain common alignment across the organization or project. That is the leadership that high performance teams desire. Take this alignment then map it into the OKRs and KPIs then in turn pick a process and let everyone know how this aligns to the vision. Create the environment in which every line of code maps to that vision. Show commitment to this vision.

Give FeedBack

Communicate. Communicate. Communicate. Collaborate. Collaborate. Collaborate. Till you puke. i cannot emphasize this enough. You must be prepared every day to manically interact with your teams and have the hard, friction-filled, uncomfortable discussions. If you want to keep the top performers let them know where they stand, how they stand and why they stand in the rankings and how they are contributing to the vision. Again attempt to create coder-metrics across your organization or project that exemplify this performance. Interact with your most important asset, your people. Over communicate. We have the ability to reach everyone at any time: email, zoom, slack, where once granite tablets were used to message. Write the message and give feedback. Better yet go take a walk with them. Have 1:1s. Listen to your people receptively and without bias and judgment about their concerns, passions, what scares them, what makes them happy, their joys, goals, and aspirations so they feel validated and understood. Solicit feedback, shut up and listen.

What all of this comes down to is what i call – Amplifying_OthersTM. This is easier said than done. Personally, i believe that you need to commit even to the point of possibly finding them a better fit for a position at another company. This goes back to understanding what truly drives the only asset there is in technology: the people. Always Be Listening, Always Be Networking, and Always Be Recruiting.

This brings up the next big question for your company – How do you attract the best, right talent? Hmmmm… that might be another blog. Let me know your thoughts on these matters in the comments.

Until Then,

#IWishYouWater <- Psycho Session In Mentawis

@tctjr

Music To Blog By:

American Beauty by The Grateful Dead. Box of Rain and Ripple are amazing. Also if you haven’t heard Jane’s Addiction’s cover of Ripple check it out. i am not a Dead fan but the lyrics on some of these songs are monumental.

References:

The Psychology of Computer Programming

Mythical Man Month

The Essence of Software: Why Concepts Matter for Great Design

Snake_Byte[7]: Pandas (Not The Animal)

Groupings Of Pandas In A Frame

DISCLAIMER: This blog was written some time ago. Software breaks once in a while and there was a ghost in my LazyWebTM machine. We are back to our regularly scheduled program. Read on Dear Reader, and your humble narrator apologizes.

The other day i was talking to someone about file manipulations and whatnot and mentioned how awesome Pandas and the magic of the df.DoWhatEverYaWant( my_data_object) via a dataframe was and they weren’t really familiar with Pandas. So being that no one knows everything i figured i would write a Snake_Byte[] about Pandas. i believe i met the author of pandas – Wes McKinney – at a PyData conference years ago at Facebook. Really nice human and has created one of the most used libraries for data wrangling.

One of the most nagging issues with machine learning, in general, is access to high integrity canonical training sets or even just high integrity data sets writ large.

By my estimate over the years having performed various types of learning systems and algorithm development, machine learning is 80% data preparation, 10% data piping, 5% training, and 5% banging your head against the keyboard. Caveat Emptor – variable rates apply, depending on the industry vertical.

It is well known that there are basically three main attributes to the integrity of the data: complete, atomic, and well-annotated.

Complete data sets mean analytical results for all required influent and effluent constituents as specified in the effluent standard for a specific site on a specific date.

Atomic data sets are data elements that represent the lowest level of detail. For example, in a daily sales report, the individual items that are sold are atomic data, whereas roll-ups such as invoices and summary totals from invoices are aggregate data.

Well-annotated data sets are the categorization and labeling of data for ML applications. Training data must be properly categorized and annotated for a specific use case. With high-quality, human-powered data annotation, companies can build and improve ML implementations. This is where we get into issues such as Gold Standard Sets and Provenance of Data.

Installing Pandas:

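Note: Before you install Pandas, bear in mind that each release supports only certain Python versions (the 1.4.x series shown below requires Python 3.8 or newer), so check the pandas installation docs for your version.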

I am also assuming you are using some type of virtual environment.

As per the usual installation packages for most Python libraries:

pip install pandas

You can also choose to use a package manager in which case it’s probably already included.

# import pandas; pd is the industry shorthand
import pandas as pd
# check the version
pd.__version__
[2]: '1.4.3'

Ok we have it set up correctly.

So what is pandas?

Glad you asked. i have always thought of pandas as enhancing numpy, as pandas is built on numpy. numpy is the fundamental library for scientific computing in python. It provides high-performance multidimensional arrays and tools to deal with them. A numpy array is a grid of values (all of the same type) indexed by a tuple of nonnegative integers; numpy arrays are fast, easy to understand, and let users perform calculations across whole arrays. pandas, on the other hand, provides high-performance, easy-to-use data structures and data analysis tools for manipulating numeric data and, most importantly, time series.

So let’s start with the pandas Series object, which is a one-dimensional array of indexed data that can be created from a list or an array:

data = pd.Series([0.1,0.2,0.3,0.4, 0.5])
data
[5]: 0    0.1
     1    0.2
     2    0.3
     3    0.4
     4    0.5
     dtype: float64

The cool thing about this output is that Series creates and wraps both a sequence and the related indices; ergo we can access both the values and index attributes. To double check this we can access values:

[6]: data.values
[6]: array([0.1, 0.2, 0.3, 0.4, 0.5])

and the index:

[7]: data.index
[7]: RangeIndex(start=0, stop=5, step=1)

You can access the associated values via the [ ] square brackets just like numpy; however, pandas.Series is much more flexible than the numpy counterpart that it emulates. They say imitation is the highest form of flattery.
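To make that flexibility a little more concrete, here is a minimal sketch of label-based access. The string index ("a" through "e") is an assumption added purely for illustration; the values are the same ones used above.

```python
import pandas as pd

# Same values as above, but with a hypothetical string index instead of 0..4
data = pd.Series([0.1, 0.2, 0.3, 0.4, 0.5], index=["a", "b", "c", "d", "e"])

print(data["b"])        # look up a single value by label
print(data["b":"d"])    # label slices include the end label
print(data.iloc[0])     # positional access still works via iloc
```

Unlike a plain numpy array, the index does not have to be a run of integers, which is where much of the extra flexibility comes from.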

If one really thinks about the aspects of pandas.Series it is really a specialized version of a python dictionary. For those unfamiliar, a dictionary (dict) is a python structure that maps arbitrary keys to a set of arbitrary values. Super powerful for data manipulation and data wrangling. Taking this a step further, pandas.Series is a structure that maps typed keys to a set of typed values. The typing is very important: just as the type-specific compiled code within numpy arrays makes them much more efficient than a python list, pandas.Series is much more efficient than python dictionaries for certain operations. pandas.Series has an insane amount of commands:

Find Series Reference Here.
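The dictionary analogy can be sketched directly, since a Series can be built from a dict. The names and ages below are made-up illustration data, not anything from a real data set.

```python
import pandas as pd

# Hypothetical example data: a plain dict mapping names to ages
ages_dict = {"Bob": 18, "Carol": 20, "Alice": 22}

ages = pd.Series(ages_dict)   # keys become the index, values become the data
print(ages["Carol"])          # dictionary-style lookup
print(ages.dtype)             # one dtype shared by all values
print(ages[ages > 18])        # boolean masking, which a dict cannot do directly
```

The last line is the kind of array-style operation where the typed values pay off.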

Next, we move to what i consider the most powerful aspect of pandas the DataFrame. A DataFrame is a data structure that organizes data into a 2-dimensional table of rows and columns, much like a spreadsheet. DataFrames are one of the most common data structures used in modern data analytics because they are a flexible and intuitive way of storing and working with data.

# Python code demonstrating how to create a
# DataFrame from a dict of lists.
# Row labels default to 0, 1, 2, ...

import pandas as pd

# initialise data as a dict of lists
data = {'Name':['Bob', 'Carol', 'Alice', ''],
        'Age':[18, 20, 22, 24]}
 
# Create DataFrame
df = pd.DataFrame(data)
 
# Print the output.
print(df)
 [8]:
    Name  Age
0    Bob   18
1  Carol   20
2  Alice   22
3          24       

Let’s grab some data. nba.csv is a flat file of NBA player statistics:

Get the NBA data file here.

i don’t watch or follow sports so i don’t know what is in this file. Just did a google search for csv statistics and this file came up.

# importing pandas package
import pandas as pd
 
# making data frame from csv file
data = pd.read_csv("nba.csv", index_col ="Name")
 
# retrieving row by loc method
first = data.loc["Avery Bradley"]
second = data.loc["R.J. Hunter"]
 
 
print(first, "\n\n\n", second)
[9]:
Team        Boston Celtics
Number                 0.0
Position                PG
Age                   25.0
Height                 6-2
Weight               180.0
College              Texas
Salary           7730337.0
Name: Avery Bradley, dtype: object 


Team        Boston Celtics
Number                28.0
Position                SG
Age                   22.0
Height                 6-5
Weight               185.0
College      Georgia State
Salary           1148640.0
Name: R.J. Hunter, dtype: object

How nice is this? Easy Peasy. It seems almost too easy.
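If you don’t have nba.csv handy, the same loc lookups can be tried on a tiny hand-built frame. This is only a sketch: the two rows are copied from the output above, and the variable name players is an assumption for illustration.

```python
import pandas as pd

# Two rows copied from the output above; a stand-in for the real nba.csv
players = pd.DataFrame(
    {
        "Name": ["Avery Bradley", "R.J. Hunter"],
        "Team": ["Boston Celtics", "Boston Celtics"],
        "Position": ["PG", "SG"],
        "Age": [25.0, 22.0],
        "Salary": [7730337.0, 1148640.0],
    }
).set_index("Name")

# One label returns a Series (a single row)
print(players.loc["Avery Bradley"])

# A row label plus a column label returns a single value
print(players.loc["R.J. Hunter", "Salary"])

# A boolean mask inside loc filters rows, and a list picks columns
print(players.loc[players["Age"] < 25, ["Team", "Position"]])
```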

For reference here is the pandas.DataFrame reference documentation.

Just to show how far-reaching pandas is now in the data science world: for all of you who think you may need to use Spark, there is a package called PySpark. In PySpark a DataFrame is equivalent to a relational table in Spark SQL and can be created using various functions. Once created, it can be manipulated using the various domain-specific-language (DSL) functions, much like your beloved SQL.

Which might be another Snake_Byte in the future.

i also found pandas being used in ye ole #HealthIT #FHIR for, as we started this off, csv manipulation. Think of this Snake_Byte as an Ouroboros.

This github repo converts csv2fhir ( can haz interoperability? ):

with pd.read_csv(file_path, **csv_reader_params) as buffer:
        for chunk in buffer:

            chunk: DataFrame = execute(chunk_tasks, chunk)

            # increment the source row number for the next chunk/buffer processed
            # add_row_num is the first task in the list
            starting_row_num = chunk["rowNum"].max() + 1
            chunk_tasks[0] = Task(name="add_row_num", params={"starting_index": starting_row_num})

            chunk: Series = chunk.apply(_convert_row_to_fhir, axis=1)

            for processing_exception, group_by_key, fhir_resources in chunk:
                yield processing_exception, group_by_key, fhir_resources
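The chunked-reading pattern in that snippet can be tried on its own. Below is a minimal sketch, assuming a small in-memory CSV (csv_text is made up here) in place of a real file; passing chunksize is what turns read_csv into an iterator of DataFrames.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for a large file on disk
csv_text = "rowNum,value\n0,10\n1,20\n2,30\n3,40\n4,50\n"

total = 0
chunk_sizes = []
# chunksize makes read_csv return an iterator of DataFrames
with pd.read_csv(io.StringIO(csv_text), chunksize=2) as reader:
    for chunk in reader:
        chunk_sizes.append(len(chunk))
        total += chunk["value"].sum()

print(chunk_sizes)  # rows per chunk
print(total)        # running total across all chunks
```

Processing one chunk at a time keeps memory use bounded by the chunk size rather than the file size, which is presumably why the csv2fhir code is written this way.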

So this brings us to the end of this Snake_Byte. Hope this gave you a little taste of a great python library that is used throughout the industry.

Muzak To Blog By:

Mike Patton & The Metropole Orchestra – Mondo Cane – June 12th 2008 (Full Show) <- A true genius at work!

One other mention on the Muzak To Blog By must go to the fantastic Greek Composer, Evángelos Odysséas Papathanassíou (aka Vangelis) who recently passed away. We must not let the music be lost like tears in the rain, Vangelis’ music will live forever. Rest In Power, Maestro Vangelis. i have spent many countless hours listening to your muzak and now the sheep are truly dreaming. Listen here -> Memories Of Green.

Ordo Ab Chao – Or Embrace Uncertainty

Masonic Motto: Ordo Ab Chao
Eternal Golden Braid

Only he who undertakes dizzying ventures is authentically human. A single chain of peaks links Prometheus with Siegfried.

~ Jean Mabire

Hello all first as always i hope everyone is safe. Second, i have not been able to write as much as i would have liked, life happens. More on that in later blogs this year.

Now to the current installment. i do not write about specific work topics at all however i am making an exception here as it pertains to the title of the blog. W.R.T. (with respect to) the title this is not some FreeMason diatribe so you can take off the tinfoil hats and stick with me, please dear reader.

If you saw the news, Francisco Partners of San Francisco, California purchased IBM Watson Health, of which i am currently the Global CTO and Chief Architect. Personally, i am extremely excited about this situation, as it appears others were as well considering the number of inbound texts, emails, and calls that i received. i truly appreciated all of the correspondence. It is an amazing opportunity with an amazing firm that understands the health technology industry with many well-placed investments.

Why does this pertain to the blog subject line? Change causes randomization, randomization can cause chaos. All of it yields Uncertainty. It also is an indicator of Entropy. I wrote a blog some time ago on Randomness.

One of my favorite equations, and possibly my favorite, is the one for Entropy:

This is the original one based on Boltzmann’s derivation:

$$S = k_\mathrm{B} \ln W$$

However, me being an information science type person i prefer the entropy of a channel made famous by one of my heroes Claude Shannon:

$$E = -\frac{1}{N}\sum_{i=1}^{N} p_{i}\log(p_{i})$$

(Note: As 𝑁→∞ this gives an entropy which is solely related to the distribution shape and does not depend on 𝑁.)
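For the numerically inclined, here is a small sketch of the sum in that formula. Note the assumption: it computes the textbook Shannon form, -Σ p log p, without the 1/N prefactor shown above, and the two coin distributions are made up for illustration.

```python
import math

def shannon_entropy(probs, base=2):
    """Return -sum(p * log(p)) over the nonzero probabilities."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

fair_coin = [0.5, 0.5]
loaded_coin = [0.9, 0.1]

print(shannon_entropy(fair_coin))    # maximum uncertainty for two outcomes
print(shannon_entropy(loaded_coin))  # less uncertainty, lower entropy
```

The more predictable the outcome, the lower the entropy, which is the thread the rest of this blog pulls on.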

Entropy (/ˈentrəpē/) is a measure of a thermodynamic quantity representing the unavailability of a system’s thermal energy for conversion into mechanical work, often interpreted as the degree of disorder or randomness in the system or lack of order or predictability; gradual decline into disorder ergo our current subject of this blog.

Many people want to think or be told that it will be A-OK. Or OK. Being a word nerd I looked up the etymology of both A-OK and OK. Here is what i found:

The expression A-Okay:

Means everything is fine. A-Okay is a space-age expression. It was used in 1961 during the flight of astronaut Alan Shepard. He was the first American to be launched into space. His flight ended when his spacecraft landed in the ocean, as planned. Shepard reported: “Everything is A-Okay.”

However, some experts say the expression did not begin with the space age. One story says it was first used during the early days of the telephone to tell an operator that a message had been received.

Then there is OK (Okay):

The gesture was popularized in the United States in 1840 as a symbol to support then Presidential candidate Martin Van Buren. This was because Van Buren’s nickname, Old Kinderhook, derived from his hometown of Kinderhook, NY, had the initials O. K. i had no idea.

There are also fun ways to say okay. Some people say okey-dokey or okey-doke. Now with text, we have kk in some cases. i have even heard people say I’m doing “hunky dory”. This American-coined adjective has been around since the 1860s, from the now-obsolete hunkey, “all right,” which stems from the New York slang hunk, “in a safe position,” and the Dutch root honk or “home.” So basically you’re fine at home. (It is also a great album by David Bowie).

So even if you are not OK, there are situations where we engage in some physical activity or mental endeavor where most want to know when “the end or finish line is near.” “OMG when is this going to END?!” However, one important thing that i have personally found is that when the distance is unknown and the end is unknown, that is when you truly find out who You are on this Earth. Sometimes the process is in the putting. Going way past what you thought possible is where the magic happens.

Uncertainty is the basis for which we live. Technically if you think about it we live next to an exploding star in an ever-expanding universe.

As a comparison here is a chart from Alan Chamberlin of JPL/Caltech. The subject matter is near-earth asteroids. Interestingly enough the objects were not previously known and thus there was no early warning. We knew when we knew. Take your time and let this chart sink into the old wetware.

This list does not include any of the hundreds of objects that collided with Earth, which were not discovered in advance, but were recorded by sensors designed to detect the detonation of nuclear devices. Of the objects so detected, 78 had impact energy greater than that of a 1-kiloton device (equivalent to 1000 tons of TNT), including 11 which had impact energy greater than that of a 10-kiloton device i.e. comparable to the atomic bombs used in the Second World War.

Why am i posting these statistics? Until we know of a universe that runs in reverse time or loops time, Entropy is ever increasing; thus randomness and uncertainty writ large are always increasing.

So what does one do? Well, we can only truly deal with what we can control. Anything exterior to that is, or should be, extraneous and superfluous in our thought patterns.

I wrote a related blog entitled “What Is It You Want?” which posited a different view. First, you need to decide what you want and if you are unable to decide what you want there is a good chance that what you don’t want is a more known entity.

Envision the worst thing that could happen. Fired from your job? Bankrupt? No. Try Grief. True loss as in the death of a loved one (including animals). Someone that will never ever return. Finality. We have all been there with someone.

In the same psychological realm True Loss can also be the other side of an “Aha!” moment. For instance, let’s say you have been working for months possibly years on something creative or an idea and suddenly in an ephemeral flash of lucidity you finally arrive at the ideation incarnate and grasp the totality of understanding. At that moment on the other side, you can no longer duplicate that feeling. It is gone. True Loss. For some that do not have family or friends, this is equivalent. Even for those that have family, friends and beloved animals, this is equivalent.

Yet we want to think in absolutes that someone somewhere will make it “OK”.

I say unto you: one must still have chaos in oneself to be able to give birth to a dancing star. I say unto you: you still have chaos in yourselves.

Zarathustra

Well as we say here in The South. “It’s OK until It Aint.” The only remedy is preparation but not analysis to paralysis. Train and train well.

The only other remedy is just GO and DO and move and make it happen. Only move in the direction of what YOU desire and want – it will make a huge difference.

The most intelligent humans, like the strongest, find their happiness where others find only disaster: in the labyrinth, in being hard with themselves and others, in effort, their delight is self mastery; in them asceticism becomes second nature, a necessity, an instinct.  They regard a difficult task as a privilege; it is to them a recreation to play with burdens that would crush all others.

R.A.C.

Humans are designed via the evolutionary process to work under stress. i believe we operate better when we are under the gun, under a deadline, reaching for goals, or striving for whatever it is that drives Us. Some prefer the sameness that comes with a 9-5 job. For me personally, i like to obtain order out of chaos.

So whenever you find yourself worried about what May-Be-Happening think instead of what IS-To-Be and make it happen. Alan Watts famously discussed that the issue with no surprises in life is that you know how everything turns out, and if that is the case then you are living in the past.

Until Then,

#iwishyouwater <- Mavericks in Half Moon Bay doing its thing 3.22

@tctjr

Muzak To Blog By: Jeff Buckley’s Album “You and I”. Huge that this was all demos. Astounding talent left us too soon. He took a night swim in the Mississippi River and was hit by a paddlewheel boat. Random? Entropy? Safe? His autopsy showed no drugs or alcohol in his body.

Snake_Byte[6] Algorithm Complexity

The Lighter Side of Complexity - The Complexity Project
Your software design?

Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius and a lot of courage to move in the opposite direction.

E.F. Schumacher

First, i hope everyone is safe.

Second, i had meant this for reading over Thanksgiving but transparently i was having technical difficulties with LaTeX rendering and it appears that both MathJax and native LaTeX are not working on my site. For those interested i even injected the MathJax code into my .php header. Hence i had to rewrite a bunch of stuff, alas with no equations. Although for some reason unbeknownst to me my table worked.

Third, Hey it’s time for a Snake_Byte [] !

In this installment, i will be discussing Algorithm Complexity and will be using a Python method that i previously wrote about in Snake_Byte[5]: Range.

So what is algorithm complexity?  Well, you may remember in your mathematics or computer science classes “Big Oh” notation.  For those that don’t know this involves both space and time complexity not to be confused with Space-Time Continuums.  

Let’s hit the LazyWeb and particularly Wikipedia:

“Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. It is a member of a family of notations invented by Paul Bachmann, Edmund Landau, and others collectively called Bachmann–Landau notation or asymptotic notation.”

— Wikipedia’s definition of Big O notation

Hmmm.   Let’s try to parse that a little better shall we?

So you want to figure out how slow, or hopefully how fast, your code is using fancy algebraic terms and terminology. You measure the algorithmic behavior along two dimensions: time complexity and space complexity. Time is both the throughput and how fast the algorithm operates from t(0) to t(n-1). Then we have space complexity, which is literally how much memory (either in memory or persistent memory) the algorithm requires as a function of the input. As an added bonus you can throw around the word asymptotic:

From Dictionary.com

/ (ˌæsɪmˈtɒtɪk) / adjective. of or referring to an asymptote. (of a function, series, formula, etc) approaching a given value or condition, as a variable or an expression containing a variable approaches a limit, usually infinity.

Ergo asymptotic analysis means how the algorithm responds “to” or “with” values that approach ∞.

So “Hey what’s the asymptotic response of the algorithm?”

Hence we need a language that will allow us to say that the computing time, as a function of n, grows ‘on the order of n^3,’ or ‘at most as fast as n^3,’ or ‘at least as fast as n log n,’ etc.

There are five symbols that are used in the language of comparing the rates of growth of functions: ‘o’ (read ‘is little oh of’), ‘O’ (read ‘is big oh of’), ‘Θ’ (read ‘is theta of’), ‘∼’ (read ‘is asymptotically equal to’ or, irreverently, as ‘twiddles’), and ‘Ω’ (read ‘is omega of’). It is interesting to note there are discrepancies amongst the ranks of computer science and mathematics as to the accuracy and validity of each. We will just keep it simple and say Big-Oh.

So let f(x) and g(x) be two functions of x. Each of the five symbols above is intended to compare the rapidity of growth of f and g. If we say that f(x) = o(g(x)), then informally we are saying that f grows more slowly than g does when x is very large.

Let’s address the time complexity piece. i don’t want to get philosophical on What is Time? So for now, and for this blog, i will treat it just like an arrow bounded by t(0) – t(n-1).

That said, the analysis of the algorithm is for an order of magnitude, not the actual running time. There is a python module called time that we can use to measure the actual running time. Remember the order-of-magnitude analysis is there to save you time upfront, to gain an understanding of the time complexity before and while you are designing said algorithm.
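As a rough sketch of measuring actual running time, the standard library’s time.perf_counter can bracket a call. The sum_of_squares workload and the input sizes are assumptions chosen only for illustration, and the timings will differ from machine to machine.

```python
import time

def sum_of_squares(n):
    """Hypothetical workload: add up i*i for i in 0..n-1."""
    total = 0
    for i in range(n):
        total += i * i
    return total

for n in (10_000, 100_000, 1_000_000):
    start = time.perf_counter()          # high-resolution timer
    result = sum_of_squares(n)
    elapsed = time.perf_counter() - start
    print(f"n={n:>9,}  elapsed={elapsed:.4f}s")
```

For a linear algorithm like this one, each tenfold increase in n should cost roughly ten times as long, give or take noise. For more careful measurements the standard library’s timeit module is worth a look.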

Most arithmetic operations are constant time; multiplication usually takes longer than addition and subtraction, and division takes even longer, but these run times don’t depend on the magnitude of the operands. Very large integers are an exception; in that case, the run time increases with the number of digits.

Indexing operations, whether reading or writing elements in a sequence or dictionary, are also constant time, regardless of the size of the data structure.

A for loop that traverses a sequence or dictionary is usually linear, as long as all of the operations in the body of the loop are constant time.

The built-in function sum is also linear because it does the same thing, but it tends to be faster because it is a more efficient implementation; in the language of algorithmic analysis, it has a smaller leading coefficient.

If you use the same loop to “add” a list of strings, the run time is quadratic because string concatenation is linear.

The string method join is usually faster because it is linear in the total length of the strings.
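The loop, sum and join cases described above can be put side by side. This is a sketch with made-up inputs, and the helper names total_with_loop and concat_with_loop are hypothetical.

```python
def total_with_loop(numbers):
    """Linear: one constant-time addition per element."""
    total = 0
    for x in numbers:
        total += x
    return total

def concat_with_loop(strings):
    """Potentially quadratic: each += may copy everything built so far."""
    result = ""
    for s in strings:
        result += s
    return result

numbers = list(range(1000))
words = ["snake", "_", "byte"]

print(total_with_loop(numbers) == sum(numbers))    # same answer; sum() has the smaller constant
print(concat_with_loop(words) == "".join(words))   # same answer; join() is linear in total length
```

Both pairs give identical results; the difference only shows up in how the running time grows as the inputs get large.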

So let’s look at an example using the aforementioned range built-in function. The lowest complexity is O(1), constant time. A single loop over range(n), much like the linear example above, is O(n). When we have a nested loop:


n, m = 3, 2  # example sizes so the snippet runs
k = 0
for i in range(n):
    for j in range(m):
        print(i)
        k = k + 1  # k ends up equal to n * m

In this case, for nested loops, we multiply the time complexities, thus O(n*m). It works the same way when a loop with time complexity (n) calls a function with time complexity (m). When calculating complexity we omit constants, regardless of whether a step executes 5 or 100 times.
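The loop-calling-a-function case can be sketched by counting steps rather than timing them. The function names inner_work and outer_loop are hypothetical, chosen only for this illustration.

```python
def inner_work(m):
    """Hypothetical helper that performs m constant-time steps."""
    steps = 0
    for _ in range(m):
        steps += 1
    return steps

def outer_loop(n, m):
    """Calls an O(m) function n times, so the total work is O(n*m)."""
    total_steps = 0
    for _ in range(n):
        total_steps += inner_work(m)
    return total_steps

print(outer_loop(10, 5))    # 50 steps
print(outer_loop(20, 5))    # doubling n doubles the work: 100 steps
```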

When you are performing an analysis look for worst-case boundary conditions or examples.

Linear O(n):

def linear(t, n):
    # worst case: no zero in t, so all n elements are checked
    for i in range(n):
        if t[i] == 0:
            return 0
    return 1

Quadratic O(n**2):

def quadratic(n):
    res = 0
    for i in range(n):
        for j in range(n):
            res += 1
    return res

There are other types of time complexity like exponential time and factorial time. Exponential Time is O(2**n) and Factorial Time is O(n!).
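To get a feel for those two growth rates, here is a small sketch that simply enumerates things: all subsets of a list (2**n of them) and all orderings (n! of them). The four-item list is an arbitrary example.

```python
from itertools import combinations, permutations

items = ["a", "b", "c", "d"]
n = len(items)

# Exponential: every subset of n items, 2**n of them
subsets = [c for r in range(n + 1) for c in combinations(items, r)]

# Factorial: every ordering of n items, n! of them
orderings = list(permutations(items))

print(len(subsets))     # 16 == 2**4
print(len(orderings))   # 24 == 4!
```

Add just a few more items and both counts explode, which is why brute-force enumeration stops being practical very quickly.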

For space complexity, memory has a limit, as you know if you have ever chased down a heap allocation or garbage collection bug. Like we said earlier there is no free lunch: you either trade space for time or time for space. Data-driven architectures respond to the input size of the data. Thus the dimensionality of the input space needs to be addressed. If you have a constant number of variables: O(1). If you need to declare an array, using numpy for instance, with (n) elements then you have linear space complexity O(n). Remember that O(1) space is independent of the size of the problem, while O(n) space grows with it.
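A minimal sketch of the two space cases, using plain Python lists rather than numpy to keep it dependency-free. The function names are hypothetical.

```python
def running_total(n):
    """O(1) extra space: one accumulator, no matter how large n is."""
    total = 0
    for i in range(n):
        total += i
    return total

def all_partial_sums(n):
    """O(n) extra space: stores one partial sum per input element."""
    sums = []
    total = 0
    for i in range(n):
        total += i
        sums.append(total)
    return sums

print(running_total(5))      # 10
print(all_partial_sums(5))   # [0, 1, 3, 6, 10]
```

Both functions do the same amount of arithmetic; the second simply trades memory for the ability to look up any intermediate result later.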

For a great book on Algorithm Design and Analysis i highly recommend:

The Algorithm Design Manual by Steven S. Skiena (click it takes you to amazon)

It goes in-depth into growth rates and dominance relations, etc., as they relate to graph algorithms, search and sorting, as well as cryptographic functions.

There are also Algorithms Unlocked by Cormen and the Algorithms Illuminated series by Roughgarden, which are great and less mathematically rigorous if that is not your forte.

Well, i hope this gave you a taste. i had meant this to be a much longer and more in-depth blog however i need to fix this latex issue so i can properly address the matters at hand.

Until then,

#iwishyouwater <- Alexey Molchanov new world freedive record. He is a really awesome human.

Muzak To Blog By: Maddalena (Original Motion Picture Soundtrack) by the Maestro Ennio Morricone – Rest in Power Maestro i have spent many hours listening to your works.

Snake_Byte[5]: Range

Now… We are going in a loop.

~ Ramakrishna, Springs of Indian Wisdom
Loops All The Way Down

First, i trust everyone is safe.

Second, i will be moving the frequency of Snake_Bytes [] to every other Wednesday. This is to provide higher quality information and also to allow me space and time to write other blogs. i trust dear reader y’all do not mind.

Third, i noticed i was remiss in explaining a function i used in a previous Snake_Byte [ ] that of the Python built-in function called range.

Range is a very useful function for, well, creating iterations on variables and loops.

# let's see how this works
# (in Python 3, wrap range in list() to see the values)
list(range(4))
[0, 1, 2, 3]

How easy can that be?

Four items were returned. Now we can create a range or a for loop over that list – very meta huh?

Please note in the above example the list starts off with 0. So what if you want your range function to start with 1 base index instead of 0? You can specify that in the range function:

# Start with 1 for initial index
list(range(1, 4))
[1, 2, 3]

Note the last number is not included: range stops just before the stop value, so to be inclusive of 4 you would write range(1, 5).

Let’s try something a little more advanced with some eye candy:
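range also accepts an optional third argument, the step. A short sketch (the variable names evens and countdown are just for illustration):

```python
evens = range(0, 10, 2)      # start, stop (exclusive), step
countdown = range(5, 0, -1)  # a negative step counts down

print(list(evens))       # [0, 2, 4, 6, 8]
print(list(countdown))   # [5, 4, 3, 2, 1]

# range supports len() and membership tests without building a list
print(len(evens), 4 in evens)
```

In both cases the stop value itself is left out, just as in the examples above.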

%matplotlib inline
import matplotlib.pyplot as plt

x_cords = range(-50,50)
y_cords = [x*x for x in x_cords]

plt.plot(x_cords, y_cords)
plt.show()

X^2 Function aka Parabola

We passed a computation into the loop to compute over the indices of range x in this case.

In one of the previous Snake_Bytes[] i utilized a for loop and range which is extremely powerful to iterate over sequences:

for i in range (3):
    print(i,"Pythons")
0 Pythons
1 Pythons 
2 Pythons

For those that really need power when it comes to indexing, sequencing and iteration you can change the list for instance, as we move across it. For example:

L = [1,2,3,4,5,6]
# now add one to each element
# i.e. L[i] = L[i] + 1, used all
# the time in matrix operations
for i in range(len(L)): 
    L[i] += 1
print (L)
[2,3,4,5,6,7]

Note there is a more “slick” way to do this with list comprehension without changing the original list in place. However, that’s outside the scope if you will of this Snake_Byte[] . Maybe i should do that for the next one?
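For the curious, a minimal sketch of that list comprehension version. The name L_plus_one is made up for this example.

```python
L = [1, 2, 3, 4, 5, 6]

# Build a new list; the original L is left untouched
L_plus_one = [x + 1 for x in L]

print(L)            # [1, 2, 3, 4, 5, 6]
print(L_plus_one)   # [2, 3, 4, 5, 6, 7]
```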

Well, i hope you have a slight idea of the power of range.

Also, i think this was more “byte-able” and not tl;dr. Let me know!

Until Then,

#iwishyouwater <- another good one here click!

@tctjr

Muzak To Blog By: Roger Eno & Brian Eno – Mixing Colors (this album is spectacular)