Hello Multi-Worlds With IBM Q

“If you are not completely confused by quantum mechanics, you do not understand it.”
~ John Wheeler

A chandelier or computing device?

Introduction

i wanted to take advantage of #socialdistancing to catch up on personal blog writing. One of the areas i have been meaning to write about is my sojourn into Quantum Computing, specifically with the IBM Q framework Qiskit (pronounced KIZ-KIT). Qiskit is an open-source quantum computing software development framework for leveraging today’s quantum processors in research, education, and business. Having read many of the latest texts (which i list at the end of the blog) and implemented some initial Hello_World python scripts, i originally put it away because it made Alice In Wonderland’s rabbit hole look tame. I did, however, go through some of the initial IBM Learnings and received the following:

Quantum
I am Bonafide

So given that, i decided to fully re-engage. The first step, as with any language or framework, is to create the proverbial “Hello_World”. However, before we get into the code, let’s address what is in the Qiskit coding framework.

The following components are within the Qiskit framework: Terra, Aer, Aqua, and Ignis:

  • Terra: Within Terra is a set of tools for composing quantum programs at the level of circuits and pulses, optimizing them for the constraints of a particular physical quantum processor, and managing the batched execution of experiments on remote-access backends.
    • User Inputs (Circuits, and Schedules), Quantum Circuit, Pulse Schedule
    • Transpilers and optimization passes
    • Providers: Aer, IBM Quantum, and Third Party
    • Visualization and Quantum Information Tools (Histogram, State, Unitary, Entanglement)
  • Aer : It contains optimized C++ simulator backends for executing circuits compiled in Qiskit Terra and tools for constructing highly configurable noise models for performing realistic noisy simulations of the errors that occur during execution on real devices.
    • Noise Simulation (QasmSimulator Only)
    • Backends ( QasmSimulator, StatevectorSimulator, UnitarySimulator)
    • Jobs and Results: Counts, Memory, Statevector, Unitary, Snapshots
  • Aqua: Libraries of cross-domain quantum algorithms upon which applications for near-term quantum computing can be built. Aqua is designed to be extensible and employs a pluggable framework where quantum algorithms can easily be added.
    • Qiskit Aqua Translators ( Chemistry, AI, Optimization, Finance )
    • Quantum Algorithms ( QPE, Grover, HHL, QSVM, VQE, QAOA, etc… )
    • Qiskit Terra ( Compile Circuits)
    • Providers: Aer, IBM Quantum and Third Party
  • Ignis: A framework for understanding and mitigating noise in quantum circuits and systems. The experiments provided in Ignis are grouped into the topics of characterization, verification and mitigation.
    • Experiments: List of Quantum Circuits and Pulse Schedules
    • Qiskit Terra: Compile Circuits or Schedules
    • Providers: Qiskit Aer, IBM Quantum, Third Party
    • Fitters / Filters: Fit to a Model/Plot Results, Filter Noise

As one can see the components are cross-referenced across the entirety of the framework and provide the quantum developer a rich set of tools, algorithms, and methods for code creation.

Putting Your Toe In The First Quantum World

This section covers very basic quantum theory. There are several great textbooks on this subject and i will list some at the end of the blog with brief reviews. Suffice to say you cannot be scared of or shy away from “greek letters or strange symbols”. To fully appreciate what is happening you need “the maths”.

That said, let us first define a qubit. Classical computers operate on bits that are either ( 0 ) or ( 1 ): strictly binary operations, due to the nature of a diode or gate. Quantum computers operate on qubits, short for quantum bits. These are represented by surrounding a name with ” | ” and ” > “. Thus a qubit “named” “1” can be written as \(| 1\rangle\). This notation is known as Dirac’s bra-ket notation. From a mathematical standpoint (and this is why the above uses the label “named”), a qubit is represented by a vector in a two-dimensional vector space over the complex numbers, \(\mathbb{C}^2\). This means that a qubit takes two complex numbers to fully describe it. Okay so think about that… It takes two numbers to describe the state. Already strange huh?

The computational (or standard) basis corresponds to the two levels \(|0\rangle\) and \(|1\rangle\), which correspond to the following vectors: $$\begin{split}|0\rangle = \begin{pmatrix}1\\ 0 \end{pmatrix}~~~~|1\rangle=\begin{pmatrix}0\\1\end{pmatrix}\end{split}$$ So remember that the state is described by two complex numbers. Well, the qubit does not always have to be in either \(|0\rangle\) or \(|1\rangle\); it can be in an arbitrary quantum state, denoted \(|\psi\rangle\), which can be any superposition \(|\psi\rangle = \alpha|0\rangle + \beta|1\rangle\) of the basis vectors. The superposition quantities \(\alpha\) and \(\beta\) are complex numbers; together they obey \(|\alpha|^2 + |\beta|^2 = 1\).

Interesting things happen when quantum systems are measured, or observed. Quantum measurement is described by the Born rule. In particular, if a qubit in some state \(|\psi\rangle\) is measured in the standard basis, the result 0 is obtained with probability \(|\alpha|^2\), and the result 1 is obtained with the complementary probability \(|\beta|^2\). Interestingly, a quantum measurement takes any superposition state of the qubit and projects it to either the state \(|0\rangle\) or the state \(|1\rangle\), with a probability determined from the parameters of the superposition.

Whew! What i found really cool was that all of the linear algebra is the same. Here is another really cool thing: to actually create the environment, the amazing scientists in the IBM Quantum Lab keep the temperature so cold (15 milliKelvin in a dilution refrigerator) that there is no ambient noise or heat to excite the superconducting qubit. Why this is needed is beyond the scope of this post, but suffice it to say it involves making a superconductor: a material that conducts electricity without encountering any resistance, and thus without losing any energy. Ok, let’s climb out of Alice’s Rabbit Hole and get to some practical code.
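
As a first bit of practical code, the Born rule described above can be sketched in plain Python, without Qiskit. This is only an illustration: the helper names `normalize` and `measure`, the fixed random seed, and the example amplitudes are assumptions made for this sketch and do not come from any quantum library.

```python
import math
import random

def normalize(alpha, beta):
    """Scale two complex amplitudes so that |alpha|^2 + |beta|^2 = 1."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    return alpha / norm, beta / norm

def measure(alpha, beta, shots, seed=0):
    """Sample standard-basis measurements: 0 with probability |alpha|^2, else 1."""
    rng = random.Random(seed)
    p0 = abs(alpha) ** 2
    counts = {"0": 0, "1": 0}
    for _ in range(shots):
        outcome = "0" if rng.random() < p0 else "1"
        counts[outcome] += 1
    return counts

# Hypothetical state: an equal-weight superposition with a relative phase of i.
alpha, beta = normalize(1 + 0j, 1j)
p0, p1 = abs(alpha) ** 2, abs(beta) ** 2
counts = measure(alpha, beta, shots=1000)
print(p0, p1, counts)
```

The two probabilities sum to 1, and the sampled counts split roughly evenly between the two outcomes. The complex phase on \(\beta\) has no effect on these standard-basis probabilities, because only \(|\beta|^2\) enters the rule.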

Setting Up The Environment

So we are assuming the reader is familiar with setting up a python virtual environment and able to either pip install or utilize a package manager like anaconda for installing the respective libraries. The complete installation process can be found here: Installing Qiskit. For completeness, i will duplicate the cogent items in the following sections. i’ll also be posting a Jupyter Notebook to github.

The simplest way to use environments is by using the conda command, included with Anaconda. A Conda environment allows you to specify a specific version of Python and set of libraries. Open a terminal window in the directory where you want to work.

Create a minimal environment with only Python installed in it.

conda create -n name_of_my_env python=3
conda activate name_of_my_env

Next, install the Qiskit package, which includes Terra, Aer, Ignis, and Aqua. ( in this writeup i will only focus on the very basics. i will get to the others in later posts! )

pip install qiskit

NOTE: Starting with Qiskit 0.13.0 pip 19 or newer is needed to install qiskit-aer from precompiled binary on Linux. If you do not have pip 19 installed you can run pip install -U pip to upgrade it. Without pip 19 or newer this command will attempt to install qiskit-aer from sdist (source distribution) which will try to compile aer locally under the covers.

If the packages installed correctly, you can run conda list to see the active packages in your virtual environment.

There are some optional packages i suggest installing for really cool circuit visualizations and the like, which work in conjunction with matplotlib. You can install these optional dependencies with the following command:

pip install qiskit-terra[visualization]

To check if everything is running hop into the python prompt and type:

import qiskit

Getting an IBM Q account and API Key

Next, you will need to register for an IBM Q account. Click this link -> Register For IBM Q Account

Here is the link just in case:

https://quantum-computing.ibm.com/

IBM Q allows you to interface directly with IBM’s remote quantum hardware and quantum simulation devices. You can execute code locally on a quantum simulator; however, getting access to the hardware and understanding how noise affects the circuits and measurements is crucial in understanding quantum algorithm development. As with any remote system you need to lock it to an API Key. When you log in you will see the following:

Generate the API token and then click on Copy API Token to copy your API Token and place it into your Jupyter Notebook. I recommend using JupyterLab Credential Store for these types of tokens and login credentials. We will come back to using the API Key so don’t misplace it!

So i am assuming you made it this far and have your venv activated and your Jupyter Lab / Notebook up and running.

Check your installation by performing the following. It should print out the latest version. Also run the following commands to store your API token locally for later use in a configuration file called qiskitrc. Replace MY_API_TOKEN with the API token value that you stored in your text editor or Jupyter Notebook. Note this method saves the credentials and token to disk. It is a matter of taste; you can choose in-session usage as well. These are some standard imports.

%matplotlib inline
import numpy as np
import qiskit  # explicit import so the name `qiskit` is bound for the version checks below
from qiskit import *
from qiskit import IBMQ
from qiskit.tools.visualization import plot_histogram
qiskit.__version__
qiskit.__qiskit_version__

IBMQ.save_account('MY_API_TOKEN') # THIS IS YOUR API KEY FROM EARLIER!

[1]: 0.12.0

i appear to be up to date.

Next you want to make sure you are up to date on the latest versioning of the platform. Since November 2019 (and with version 0.4 of this qiskit-ibmq-provider package), the IBM Quantum Provider only supports the new IBM Quantum Experience, dropping support for the legacy Quantum Experience and Qconsole accounts. The new IBM Quantum Experience is also referred to as v2, whereas the legacy one and Qconsole as v1.

IBMQ.update_account()

Depending on your credentials you will either get a listing of updating credentials or that you are up to date.

IBM Q has various backends to run your code upon. The default is a full-fledged simulator that is invoked locally, which is very convenient. The next invocation method is via direct quantum computing hardware access. i must say it is astounding that one can access quantum computing resources via open-source tools.

By default, all IBM Quantum Experience accounts have access to the same, open project (hub: ibm-q, group: open, project: main). For convenience, the IBMQ.load_account() and IBMQ.enable_account() methods will return a provider for that project. If you have access to other projects, you can use:

provider_2 = IBMQ.get_provider(hub='MY_HUB', group='MY_GROUP', project='MY_PROJECT')

i used the following to check out the available backends. Note: The name is just a name – not the location of the hardware:

provider = IBMQ.get_provider(group='open')
provider.backends()
[10]: [<IBMQSimulator('ibmq_qasm_simulator') from IBMQ(hub='ibm-q', group='open', project='main')>,
 <IBMQBackend('ibmqx2') from IBMQ(hub='ibm-q', group='open', project='main')>,
 <IBMQBackend('ibmq_16_melbourne') from IBMQ(hub='ibm-q', group='open', project='main')>,
 <IBMQBackend('ibmq_vigo') from IBMQ(hub='ibm-q', group='open', project='main')>,
 <IBMQBackend('ibmq_ourense') from IBMQ(hub='ibm-q', group='open', project='main')>,
 <IBMQBackend('ibmq_london') from IBMQ(hub='ibm-q', group='open', project='main')>,
 <IBMQBackend('ibmq_burlington') from IBMQ(hub='ibm-q', group='open', project='main')>,
 <IBMQBackend('ibmq_essex') from IBMQ(hub='ibm-q', group='open', project='main')>,
 <IBMQBackend('ibmq_armonk') from IBMQ(hub='ibm-q', group='open', project='main')>]

Running Your First Circuits

There are several ways to run your first circuits. There is online access via in-place Jupyter Notebooks as well as a visual circuit designer called IBM Circuit Composer, which you can access via your IBM Q account. i will be describing steps using python code and direct Qiskit usage due to flexibility, transparency, and granularity over the environment. The following selects the 'ibmq_qasm_simulator' backend:

my_provider = IBMQ.get_provider()
my_provider.backends()
my_provider.get_backend('ibmq_qasm_simulator')

So, some terminology: registers are used to create circuits, and circuits act upon registers. Now let’s actually look at some code that generates some registers as well as a quantum circuit:

Here we have a script that starts off with an input of 2 quantum “0” bits. There is no action before it outputs a classical equivalent of bits:

So if you run this you will get the output:

Total count for 00 and 11 are: {'00': 517, '11': 483}

Here is what is happening:

  • QuantumCircuit.h(0): A Hadamard gate 𝐻 on qubit 0, which puts it into a superposition state.
  • QuantumCircuit.cx(0, 1): A controlled-NOT operation (𝐶𝑋) on control qubit 0 and target qubit 1, putting the qubits in an entangled state.
  • QuantumCircuit.measure([0,1], [0,1]): if you pass the entire quantum and classical registers to measure, the ith qubit’s measurement result will be stored in the ith classical bit.
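
To make the three steps above concrete without relying on any particular Qiskit version, here is a minimal state-vector sketch in plain Python. It is a stand-in for illustration only, not the Qiskit script itself: the function names, the index convention (index = 2·q1 + q0) and the fixed seed are all assumptions of this sketch.

```python
import math
import random
from collections import Counter

# State vector over basis states |q1 q0>, indexed as 2*q1 + q0: "00", "01", "10", "11".
state = [1 + 0j, 0j, 0j, 0j]  # both qubits start in |0>

def apply_h_on_q0(s):
    """Hadamard on qubit 0: mixes each pair of amplitudes that differ only in q0."""
    r = 1 / math.sqrt(2)
    out = list(s)
    for base in (0, 2):  # q1 = 0, then q1 = 1
        a, b = s[base], s[base + 1]
        out[base], out[base + 1] = r * (a + b), r * (a - b)
    return out

def apply_cx_q0_q1(s):
    """CNOT with control q0 and target q1: swaps the |01> and |11> amplitudes."""
    out = list(s)
    out[1], out[3] = s[3], s[1]
    return out

def sample_counts(s, shots, seed=1):
    """Measure both qubits `shots` times using the Born-rule probabilities."""
    rng = random.Random(seed)
    probs = [abs(a) ** 2 for a in s]
    labels = [format(i, "02b") for i in range(4)]
    return dict(Counter(rng.choices(labels, weights=probs, k=shots)))

state = apply_cx_q0_q1(apply_h_on_q0(state))
counts = sample_counts(state, shots=1000)
print(counts)
```

Only the outcomes '00' and '11' can appear, each with probability one half, which matches the shape of the counts shown earlier: the two qubits are perfectly correlated.
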
Your First Quantum Circuit

So this is an ASCII printout. i was really impressed when i found out this tidbit. You can also pass in “mpl” for matplotlib or “latex” for full on latex beautification!

circuit.draw("mpl") and circuit.draw("latex")
Matplotlib representation of the circuit

NOTE: The latex and latex_source drawers need pylatexenc installed. Run "pip install pylatexenc" before using the latex or latex_source drawers. Professor Donald Knuth will be pleased.

#Plot a histogram
plot_histogram(counts)
Histogram showing the probability of results

The observed probabilities 𝑃𝑟(00) and 𝑃𝑟(11) are computed by taking the respective counts and dividing by the total number of shots.
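
As a small worked example of that division, using the counts quoted earlier in this post (the variable names are just illustrative):

```python
counts = {"00": 517, "11": 483}  # counts quoted earlier in this post

shots = sum(counts.values())  # total number of shots
probabilities = {outcome: n / shots for outcome, n in counts.items()}
print(shots, probabilities)  # 1000 {'00': 0.517, '11': 0.483}
```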

Next Steps

So this is just a small step into the world of quantum programming. Below i have included several resources for study. If you are interested in pursuing this area i do urge you to take your time. i hope this at least gives you a perspective and provides a vehicle for entry. i personally feel completely humbled every time i start to read or re-read something in this area. Quantum computing is going to change the way we view our world. i for one will be going deeper in this area as far as i am intellectually capable of taking the process.

NOTE: The title of this blog refers to the many-worlds interpretation (MWI) of quantum mechanics, originally proposed by Hugh Everett, with a pun on Hello_World. The many-worlds interpretation implies that there is a very large, perhaps infinite, number of universes. It is one of many multiverse hypotheses in physics and philosophy. MWI views time as a many-branched tree, wherein every possible quantum outcome is realized.

Resources

IBM Q User Guides All of the official IBM Q User Guides – very comprehensive.

IBM Q Wikipedia – A good readers digest of the history of IBM Q

The IBM Quantum Experience – the entry and dashboard experience

IBM Q online book – an amazing interactive experience that covers everything from physics and linear algebra to code.

Mastering Quantum Computing with IBMQX – a great practical well-written book on how to get your hands coding on IBM Q

Dancing with Qubits – Written by Dr Bob Sutor of IBM a wonderful text on the mathematics and processes of quantum computing

Practical Quantum for Developers – a multi-disciplinary book that covers all aspects of coding for quantum from python, apis, cryptography, and even game theory.

Quantum Computing – A Gentle Introduction – this book covers the fundamentals of quantum computing in a very pragmatic fashion and focuses on the mathematical aspects.

Quantum Algorithms via Linear Algebra – the title is the content. Ready, set, Linear Algebra – it’s the same stuff only quantum!

Quantum Computing for Computer Scientists – very close to the Gentle Introduction text however it covers the theory in-depth and also goes over several different types of algorithms.

Many-Worlds Interpretation (MWI) – The many-worlds interpretation implies that there is a very large, perhaps infinite, number of universes. It is one of many multiverse hypotheses in physics and philosophy. MWI views time as a many-branched tree, wherein every possible quantum outcome is realized. This is intended to resolve some paradoxes of quantum theory, such as the EPR paradox and Schrödinger’s cat, since every possible outcome of a quantum event exists in its own universe. If you ask, i’ll say the cat is dead.

Until then,

#IWishYouWater

tctjr

COVID-19 Complexity Relationships

As most are probably aware (and i hope that you are at this point), COVID-19 appears to be a very serious worldwide concern. From a complexity systems relationship standpoint there are several interesting aspects here that for some might be self-evident and for others might not be so self-evident. First, let us start with some observations concerning health and wellness in general:

Your wellness and health are the most important aspect of your life:

  • by definition, it is distributed
  • by definition, it affects others – i.e. it’s networked
  • by definition, it involves proximity – human caring and empathy

Given that, it is a networked system and can have very non-linear behaviors. i was just having a discussion of an issue that could have a great effect (and affect) upon seemingly unrelated entities. Paper money is a fragile medium and also can carry chemicals and pathogens. Of interest:

The World Health Organization (WHO) has advised people to wash their hands and stop using cash if possible as the paper bills may help spread coronavirus.

Here is the link:

https://www.ktvu.com/news/contaminated-cash-may-spread-coronavirus-world-health-organization-warns

The other happening is large corporations are canceling travel and conferences.

This brings me to the non-linear relationships, which are two-fold (for now), though there will be several others: (1) cryptocurrency usage will skyrocket (2) “De-Officing” will start a trend in remote telecommuting work, which will cause teleconferencing companies’ stock to increase.

Just some observations.

Until then,

Be safe and I wish You Water.

@tctjr

NeurIPS 2019

And they asked me how I did it, and I gave ’em the Scripture text,
“You keep your light so shining a little in front o’ the next!”
They copied all they could follow, but they couldn’t copy my mind,
And I left ’em sweating and stealing a year and a half behind.

~ “The Mary Gloster”, Rudyard Kipling, 1896

My Badge – I exist.

Well, your humble narrator finally made it to NeurIPS 2019. There were several starts and stops to my travel itinerary but I finally persevered!

Bienvenue – Vancouver, British Columbia

First and foremost, while the location, at least for me, required multiple hops, Vancouver, BC is a beautiful city. The Vancouver conference center is spacious and an exemplary venue. Also, for those that have the time, Whistler / Blackcomb is one of the best mountains in North America for snow sports this time of the year. While I didn’t get to go, I am hopeful that I will win the registration lottery next year for 2020 and will plan accordingly.

Vancouver Conference Center – Oh Canada!

This year the conference was a veritable who’s who of information-theoretic companies. Most of the top market cap companies are now information theoretic-based technology companies and as such have representation here at the conference. To wit, IBM Research AI was a diamond sponsor:

While it is nearly impossible to quantify the breadth and depth of the subject matter presented here at the conference I have attempted to classify some overall themes:

  • Agent-Based Modelling and Behaviors
  • Imitation, Meta, Transfer, Policy Learning and Behavioral Cloning
  • Morphological Systems based on Evolutionary Biology
  • Optimization methods for non-convex models
  • Hybrid Bayesian and MCMC methods
  • Ordinary Differential Equation (ODE) direct Modelling and Systems
  • Neuroscience models that couple computational agents and hypotheses of consciousness

Side Note: I think it is amazing that 10 years ago you could not say “I’m using a Neural Network for …” without being laughed out of the room. Now there is an entire set of tracks dedicated to said technology and algorithms.

The one major difference in this conference compared to what I have read and heard, albeit second hand or through reports or blogs, is the focus on “Where is your github?” and the question of how fast we can get to production. There was a very focused and volitional undertone to the questions.

One aspect that has not changed and appears to have been amplified is the recruiter/job marketplace and (ahem) situation at the conference. To say that it was transparent and out in the open would be an understatement.

New To NeurIPS:

For those that have never been to NeurIPS I’ll provide some recommendations:

  • Download the conference app and fill out your profile
  • Plan your agenda
  • Get to the poster sessions – early
  • Network as much as possible
  • Wear comfortable shoes – it is in the same venue next year, with lots of walking.
  • Attempt to get as close a hotel as possible due to \(P(\text{Rain} \mid \text{Conference Timing}) > 0.5\)

Trends and Categories:

Agent-Based Modelling and Behaviors

This area is finally coming to fruition in the production market at scale. We are seeing both ABB (agent-based behaviors) and ABM (agent-based modeling, aka self-emergent / self-organizing behaviors). There were many presentations on multi-agent behaviors in the context of both policy and environment responses using reinforcement learning and q-learning.

Imitation, Meta, Transfer, Policy Learning and Behavioral Cloning

I grouped all of these together even though technically they are different in application and scope. However, they can be, and are, mixed together for applied systems. Take imitation learning (IL), for instance. In IL, instead of trying to learn from sparse rewards or manually specifying a reward function, an expert (typically a human) provides us with a set of demonstrations. The agent then tries to learn the optimal policy by following and imitating the expert’s decisions. Historically this was called Expert Systems Engineering. However, note the policy learning implicit in this area as well. Furthermore, behavioral cloning is a method by which human subcognitive skills can be captured and reproduced in a computer program. As the human subject performs the skill, his or her actions are recorded along with the situation that gave rise to the action. So as one can see, all of these areas are closely related to a so-called expert reference. Algorithms of consensus among multi-agents will play a crucial role here.

Morphological Systems based on Evolutionary Biology

Morphology is a branch of biology dealing with the study of the form and structure of organisms and their specific structural features. Turing wrote a paper on morphogenesis and S. Kauffman wrote “The Origins of Order: Self-Organization and Selection in Evolution”, just to name a few. We are headed into areas where physics, chemistry, and biology are being brought into play with computing, once again at scale. This multi-modality computing will also benefit from access to the developments in accessible quantum computing.

Optimization methods for non-convex models

Gradient descent in all of its flavors has been our friend for decades. Are the local minima our friend or foe? The algorithms are now starting to ask “Where am I?”
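
To illustrate the local-minima question, here is a minimal sketch of plain gradient descent on a one-dimensional non-convex function. The function \(f(x) = x^4 - 3x^2 + x\), the learning rate, and the two starting points are arbitrary choices made for this sketch; they are not taken from any of the conference papers.

```python
def f(x):
    """A simple non-convex function with two separate minima."""
    return x**4 - 3 * x**2 + x

def grad_f(x):
    """Analytic derivative of f."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=2000):
    """Vanilla gradient descent starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

x_left = gradient_descent(-2.0)   # starts to the left of the bump
x_right = gradient_descent(2.0)   # starts to the right of the bump
print(x_left, f(x_left))
print(x_right, f(x_right))
```

Both runs stop where the gradient vanishes, but the run started on the right settles in the shallower minimum: the same algorithm gives a different answer depending only on where it began.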

Hybrid Bayesian and MCMC methods

In 2007 I founded a machine learning and NLP as a service company called “BeliefNetworks”. This self-referencing name should illustrate where I stand on inference methods. Due to access to cycles and throughput, we are finally starting to see these methods integrated system-wide.

Ordinary Differential Equation (ODE) direct Modelling and Systems

Having worked for years in the areas of numerical optimization, this is another area that is near and dear. I saw several papers mapping ODEs to geometric representations. Analog computing could very well be in our return to the future. Navier-Stokes equations anyone? I see the industry moving into flow models that truly model the foundational Cauchy momentum equations, depending on the application area. We are going to see both software and hardware development in this area.

Neuroscience models that couple computational agents and hypotheses of consciousness

Given all of the above, computer scientists are pulling in physicists, biologists, chemists and, finally, neuroscientists. Possibly the “C” word is no longer anathema? I promise I will not insert a terminator picture here. However, given the developments in cognition and understanding quantum biology, we are now starting to be able to model, at least initially, what we “think” we are thinking about in some cases. Yoshua Bengio gave a great talk on volitional causal and “conscious” tasks easily accomplished by humans. We also see this with the developments in the areas of spiking algorithms.

Papers, Posters, Demos – Oh My!

As part of this blog, I wanted to review a couple of my favorite presentations, posters, and papers. While this is not a ranked list nor is it a temporal chronological review it is a list of papers that resonated with me for various reasons. While I will be listing papers I will also be posting pictures of poster papers and some meetups that I attended.

Blind Super-Resolution Kernel Estimation using an Internal-GAN

This paper was interesting to me on several fronts. The basic premise for super-resolution kernels is thus: $$I_{LR} = (I_{HR} * k_s)\downarrow_s$$ That is, the low-resolution (LR) image is the high-resolution (HR) image convolved with an SR kernel \(k_s\) and then subsampled by a scale factor \(s\). The paper introduced “KernelGAN” – an image-specific internal-GAN, which estimates the SR kernel that best preserves the distribution of patches across scales of the LR image. This is what I would consider significant progress over previous methods, since it estimates an image-specific SR-kernel based on the LR image alone. This allows a one-shot mode for training based on the LR image. Network training is done during test time. There is no actual inference step since the training implicitly contains the resulting SR-kernel. They give results in the paper as well as metrics of performance based on the NTIRE 2018 dataset, although given the first application of a deep linear network I would imagine this doesn’t really do it justice. Very impressive and I can see several applications of this method and algorithm.
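
The degradation model in the equation above can be sketched in a few lines of plain Python. Note the assumptions: the kernel here is a fixed, known 2x2 box filter and the image is a tiny made-up 4x4 grid, whereas KernelGAN's whole point is to estimate an unknown kernel. The function name `blur_and_subsample` is hypothetical and only illustrates the forward model.

```python
def blur_and_subsample(image, kernel, scale):
    """Convolve `image` with `kernel` (valid region only), then keep every `scale`-th pixel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    low_res = []
    for r in range(0, out_h, scale):
        row = []
        for c in range(0, out_w, scale):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # The kernel is flipped, as in a true convolution.
                    acc += image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
            row.append(acc)
        low_res.append(row)
    return low_res

# Hypothetical 4x4 "high-resolution" image with pixel values 0..15.
high_res = [[float(4 * r + c) for c in range(4)] for r in range(4)]
box_kernel = [[0.25, 0.25], [0.25, 0.25]]

low_res = blur_and_subsample(high_res, box_kernel, scale=2)
print(low_res)  # [[2.5, 4.5], [10.5, 12.5]]
```

Each low-resolution pixel is the average of a 2x2 block of the high-resolution image. A different kernel would produce a different low-resolution image from the same source, which is why knowing the kernel matters for super-resolution.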

Project website: http://www.wisdom.weizmann.ac.il/~vision/kernelgan

q-means: A Quantum Algorithm for Unsupervised Machine Learning

The cogent aspect of this paper was the efficiency of storing the vectors. Classical data expressed in the form of N-dimensional complex vectors can be mapped onto quantum states over \(\log_2 N\) qubits when the data is stored in a quantum random access memory (qRAM). Specifically, the distance estimation becomes very efficient when having quantum access to the vectors and the centroids via qRAM. The optimization yields a k-means optimization with $$T = O(\log(d))$$ Further, the paper showed that you can also query the norm of the vectors within the state preparation.
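
The \(\log_2 N\) count can be illustrated with a classical sketch of amplitude encoding: an N-dimensional vector is padded to the next power of two and normalized, so that its entries can serve as the amplitudes of a quantum state. This only shows the bookkeeping, not qRAM or the q-means algorithm itself; the function name `amplitude_encode` and the example vector are assumptions of this sketch, and the input is assumed to be non-zero.

```python
import math

def amplitude_encode(vector):
    """Return (num_qubits, amplitudes): a unit-norm copy padded to a power-of-two length."""
    n = len(vector)
    num_qubits = max(1, (n - 1).bit_length())  # smallest q with 2**q >= n
    padded = list(vector) + [0j] * (2**num_qubits - n)
    norm = math.sqrt(sum(abs(x) ** 2 for x in padded))
    return num_qubits, [x / norm for x in padded]

# Hypothetical 8-dimensional data vector with norm 13.
num_qubits, amplitudes = amplitude_encode([3, 4, 0, 0, 0, 0, 0, 12])
print(num_qubits, len(amplitudes))  # 3 8
```

Eight entries fit into the amplitudes of three qubits, and doubling the dimension adds only one more qubit.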

Making AI Forget You: Data Deletion in Machine Learning

One of the issues with GDPR legislation and the right to be forgotten comes up when you must re-train on the entire data set. This paper addresses methodologies that enable partial re-training. The paper goes over past methods from cryptography and differential privacy, which do not delete data but attempt to make data private or non-identifiable. From the paper: “Algorithms that support efficient deletion do not have to be private, and algorithms that are private do not have to support efficient deletion. To see the difference between privacy and data deletion, note that every learning algorithm supports the naive data deletion operation of retraining from scratch. The algorithm is not required to satisfy any privacy guarantees. Even an operation that outputs the entire dataset in the clear could support data deletion, whereas such an operation is certainly not private.” The paper goes on to define four areas of metric performance for DDIML: Linearity, Laziness, Modularity, and Quantization. They do state that they assumed user-based deletion requests correspond to only a single datapoint, and that this needs to be extended. However, for the unsupervised k-means they describe, they have deletion efficiency with substantial algorithm speedup.

paper here: https://arxiv.org/pdf/1907.05012.pdf

Causal Confusion in Imitation Learning

From Wikipedia: “Behavioral cloning is a method by which human sub-cognitive skills can be captured and reproduced in a computer program. As the human subject performs the skill, his or her actions are recorded along with the situation that gave rise to the action.” The fundamental premise was comparing expert versus computational policy and minimizing a graph-based objective: $$\mathbb{E}_G\left[\ell\left(f_\phi([X_i \odot G, G]), A_i\right)\right]$$ where \(G\) is drawn uniformly at random over all \(2^{n}\) graphs, optimizing the mean squared error loss for the continuous action environments and a cross-entropy loss for the discrete action environments. Something very interesting happens during this process of imitation learning with experts. In particular, it leads to a counter-intuitive “causal misidentification” phenomenon: access to more information can yield worse performance, ergo more is not better! The paper demonstrates this in an autonomous vehicle scenario, with phases of targeted intervention to predict the graph behavior. They did state the solutions are not production-ready. I really appreciated the honesty.

paper: https://papers.nips.cc/paper/9343-causal-confusion-in-imitation-learning.pdf

Learning To Control Self-Assembling Morphologies: A Study of Generalization via Modularity

The idea of modular and self-assembling agents goes back at least to von Neumann’s Theory of Self-Reproducing Automata. In robotics, such systems have been termed “self-reconfiguring modular robots”. E. Schrödinger posed this same question in “What is Life?”. This was one of my favorite demonstrations and presentations. I have been extremely “pro” using agent-based self-organizing algorithms for quite some time. This paper and presentation utilize zero-shot generalization: they train policies that generalize to changes in the number of limbs of the entity as well as the environment. They then pick the best model from training and evaluate it without any fine-tuning at test-time.

paper: https://arxiv.org/pdf/1902.05546.pdf

Quantum Wasserstein GANs

The poster and paper dealt with what is supposedly the first design of quantum Wasserstein Generative Adversarial Networks (WGANs), which has been shown to improve the robustness and the scalability of the adversarial training of quantum generative models on noisy quantum hardware. Parameterized quantum circuits can be used as a parameterized representation of functions, called quantum neural networks, which can be applied to classical supervised learning models or used to construct generative models. The paper also showed how to turn the quantum Wasserstein semimetrics into a concrete design of quantum WGANs that can be efficiently implemented on quantum machines. FWIW in functional analysis, pseudometrics often come from seminorms on vector spaces, and so it is natural to call them “semimetrics”. The paper used WGANs to generate a 3-qubit quantum circuit of 50 gates that approximated a 3-qubit simulation circuit that requires over 10k gates using off-the-shelf standard techniques. The QWGAN can thus be used to approximate complex quantum circuits with smaller circuits. A smaller circuit was trained to approximate the Choi–Jamiolkowski isomorphism, or Choi state, which encodes the action of a quantum circuit.

Deep Signature Transforms

Signatures refer to a set of statistics given a stream of data. The other type of signature is for the transform. Sometimes this is also called the transform kernel. The idea of a signal kernel or transform is to model a curve as a linear combination. Signatures provide a basis for functions on the space of curves. These functions can then be used as operative building blocks. The space of streams over V can then be defined as: $$S(V) = \{x = (x_1, \ldots, x_n) : x_i \in V,\ n \in \mathbb{N}\}$$ This also has interesting ramifications for feature mapping/engineering processes, as well as for embedding the signatures within algorithms, in this case as a layer within a neural network. This is akin to some fingerprinting techniques used in the past for media, and the paper does mention: “in order to preserve the stream-like nature is to sweep a one-dimensional convolution along the stream.” The embedding techniques, and the way they preserve the nature of the path, made this an extremely enjoyable discussion.

code here: https://github.com/patrick-kidger/Deep-Signature-Transforms

paper here: https://arxiv.org/pdf/1905.08494.pdf
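To make the idea of a signature as “a set of statistics” of a stream concrete, here is a small sketch that computes the first two levels of the signature of a piecewise-linear path in plain Python. It is not taken from the repository linked above; the function name `signature_depth2` and the example path are assumptions for illustration only.

```python
def signature_depth2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    `path` is a list of points, each a list of d floats.
    Level 1 is the total increment; level 2 holds the iterated
    integrals S[i][j] = integral over s < t of dx_i(s) dx_j(t).
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, cur in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, cur)]
        for i in range(d):
            for j in range(d):
                # Chen's identity for appending one straight segment.
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


# Example stream: right one unit, then up one unit.
s1, s2 = signature_depth2([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
levy_area = 0.5 * (s2[0][1] - s2[1][0])
print(s1, s2, levy_area)
```

The antisymmetric part of level 2 (the Lévy area) captures the order in which the coordinates moved, which the total increment alone cannot see. Going up first and then right would give the same level 1 but the opposite sign for the area.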

Metamers Of Neural Networks

This paper was near and dear to me due to some of my past lives working in the areas of psychological and perceptual media models. Metamers are a psychophysical color match between two patches of light that have different sets of wavelengths. This means that metamers are two patches of color that look identical to us but are made up of different physical combinations of wavelengths. In this paper, the authors generate “model metamers” to test the similarity between human and artificial neural network representations. The group generated model metamers for natural stimuli by performing gradient descent on a noise signal, matching the responses of individual layers of image and audio networks to a natural image or speech signal. The resulting signals reflect the invariances instantiated in the network up to the matched layer. As with most things in machine learning, the team asked whether the nature of the invariances would be similar to those of humans, in which case the model metamers should remain human-recognizable regardless of the stage from which they are generated. In this case, the humans diverged from the neural networks. We need more of this type of work on how perceptions affect machine learning outcomes, or possibly priors?

paper here: https://papers.nips.cc/paper/9198-metamers-of-neural-networks-reveal-divergence-from-human-perceptual-systems.pdf
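The gradient-descent recipe described above can be sketched on a toy problem. This is not the paper's code or networks: the "layer" below is a hypothetical 3-input, 2-output linear map, and the weights, reference stimulus, starting point and learning rate are all made-up values chosen so the example runs in plain Python.

```python
def layer(x, weights):
    """A toy linear 'layer': one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def make_metamer(x_ref, x_init, weights, lr=0.1, steps=200):
    """Gradient descent on the input so its layer response matches x_ref's."""
    target = layer(x_ref, weights)
    x = list(x_init)
    for _ in range(steps):
        residual = [a - t for a, t in zip(layer(x, weights), target)]
        # Gradient of sum(residual^2) with respect to x is 2 * W^T residual.
        for k in range(len(x)):
            grad_k = 2.0 * sum(r * row[k] for r, row in zip(residual, weights))
            x[k] -= lr * grad_k
    return x


W = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]          # 3 inputs -> 2 responses, so information is lost
x_ref = [1.0, 2.0, 3.0]        # stands in for the natural stimulus
x_start = [1.0, 0.0, 0.0]      # stands in for the noise initialization
x_met = make_metamer(x_ref, x_start, W)
print(x_met, layer(x_met, W), layer(x_ref, W))
```

The optimized input produces the same layer response as the reference while being a different input, which is the sense in which it is a metamer for this toy model. Whether such signals still look or sound like the original to a human is exactly the question the paper asks.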

Weight Agnostic Neural Networks

I particularly enjoyed this poster and the commentary “Animals have innate abilities…” I also believe most of the animal kingdom is sentient as well as operating on literally different wavelengths (spectrum etc). The paper demonstrates a method that can find minimal neural network architectures that can perform several reinforcement learning tasks without weight training. Ergo the title Weight Agnostic. In place of optimizing the weights of a fixed network, they sought to optimize instead for architectures that perform well over a wide range of weights. When I walked up to the poster I immediately thought of Algorithmic Information Theory (AIT) and how soft weights have been used for neural networks. AIT is based on Kolmogorov complexity: the Kolmogorov complexity of a computable object is the minimum length of a program that can compute it. The paper goes into detail concerning the Minimal Description Length (MDL) of a program and the recent dusting off of these processes applied to larger deep learning nets. The poster did not reflect the transparency of the paper: the research was very focused on creating generalized network architectures, which IMHO is a step toward AGI, and the paper states that the WANN is not approaching the performance of engineered CNNs. I also appreciated the overall frankness of the paper. Quote from the paper: “This paper is strongly motivated towards these goals of blending innate behavior and learning, and we believe it is a step towards addressing the challenge posed by Zador. We hope this work will help bring neuroscience and machine learning communities closer together to tackle these challenges.”

Interactive version of the paper here: https://weightagnostic.github.io/

Regular paper here: https://arxiv.org/pdf/1906.04358.pdf
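To give a feel for “architectures that perform well over a wide range of weights”, here is a toy sketch, not the paper's search procedure. Two hypothetical one-connection architectures that differ only in their activation are scored on a made-up task by averaging their error over several values of a single shared weight, with no weight training at all. The task, the inputs and the list of weight values are my assumptions for illustration.

```python
def relu(v):
    return max(0.0, v)


# Two hypothetical single-connection "architectures" that differ only in activation.
ARCHITECTURES = {"abs": abs, "relu": relu}

# Assumed evaluation settings for this toy: shared weight values and a target task.
SHARED_WEIGHTS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
INPUTS = [-1.0, -0.5, 0.0, 0.5, 1.0]


def target(x):
    return abs(x)


def mean_error_over_weights(activation):
    """Average squared error across all shared weights, with no weight training."""
    total = 0.0
    for w in SHARED_WEIGHTS:
        errors = [(activation(w * x) - target(x)) ** 2 for x in INPUTS]
        total += sum(errors) / len(errors)
    return total / len(SHARED_WEIGHTS)


scores = {name: mean_error_over_weights(fn) for name, fn in ARCHITECTURES.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The architecture whose structure already encodes the task (here the `abs` activation) scores better across the whole weight range, even though neither was ever trained. That is the flavor of "innate" structure the poster was getting at.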

Inducing Brain Relevant Bias in Natural Language Processing Models

This poster was part of a general theme that I saw throughout the conference: utilizing medical imaging devices to create better canonical models for machine learning. The paper shows that the relationship between language and brain activity learned by BERT (Bidirectional Encoder Representations from Transformers) during fine-tuning transfers across multiple participants. The paper goes on to show that, for some participants, the fine-tuned representations learned from both magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) are better for predicting fMRI than the representations learned from fMRI alone, indicating that the learned representations capture brain-activity-relevant information that is not simply an artifact of the modality. The model predicts the fMRI activity associated with reading arbitrary text passages well enough to distinguish which of two story segments is being read with 74% accuracy. That is impressive, and I believe we need more multi-modality papers and research of this nature.

Full site with paper data etc: http://www.cs.cmu.edu/~fmri/plosone/

A Robust Non-Clairvoyant Dynamic Mechanism for Contextual Auctions

This paper caught my eye as I spend a great deal of time researching agents in game-theoretic or mechanism-design-based situations. What really caught my eye was the terminology non-clairvoyant. I suppose if there was a method that was truly clairvoyant we wouldn’t be concerned with the robustness of said algorithms. Actually, it is a real definition: a dynamic mechanism is non-clairvoyant if the allocation and pricing rule at each period does not depend on the type distributions in the future periods. In many types of auctions, especially in ad networks, the seller must rely on approximate or asymmetric models of the buyer’s preferences to effectively set auction parameters such as a reserve price. In mechanism design, you essentially have three vectors of input: [1] a collective decision problem, [2] a measure of quality to evaluate any candidate solution, [3] a description of the resources (information) held by the participants. The paper presented a learned policy model and framework that could be applied in phases and possibly extrapolated to other types of applications. I personally think dynamic mechanism design has great applicability in the areas of distributed computing and distributed ledger platforms.
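To make the non-clairvoyance condition a little more tangible, here is a rough sketch using a plain second-price auction with a reserve price. This is not the mechanism from the paper. The function names, the reserve rule (half of the seller's current estimate) and the data are hypothetical, and a single point estimate stands in for the type distribution of a period.

```python
def second_price_with_reserve(bids, reserve):
    """Single-item second-price auction with a reserve.

    Returns (winner_index, price), or (None, 0.0) if no bid meets the reserve.
    """
    eligible = sorted(((b, i) for i, b in enumerate(bids) if b >= reserve),
                      reverse=True)
    if not eligible:
        return None, 0.0
    winner = eligible[0][1]
    runner_up = eligible[1][0] if len(eligible) > 1 else reserve
    return winner, max(reserve, runner_up)


def run_non_clairvoyant(periods, reserve_rule):
    """Run one auction per period; the reserve sees only the current estimate.

    `periods` is a list of (current_value_estimate, bids). The rule is never
    given estimates from later periods, which is the non-clairvoyance condition.
    """
    revenue = 0.0
    for estimate, bids in periods:
        reserve = reserve_rule(estimate)
        _, price = second_price_with_reserve(bids, reserve)
        revenue += price
    return revenue


# Hypothetical data: (seller's current estimate of typical value, submitted bids).
periods = [(8.0, [3.0, 7.0, 5.0]), (10.0, [4.0, 2.0]), (4.0, [1.0, 6.0])]
total = run_non_clairvoyant(periods, reserve_rule=lambda est: 0.5 * est)
print(total)
```

Note how a poor estimate in the second period sets the reserve above every bid and the item goes unsold. That sensitivity to approximate models of the buyer is where the robustness question comes from.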

I also attended the NASA Frontier Development Lab (FDL) event that was sponsored by Google, Intel and Nvidia. I was part of the NASA FDL AI Astronaut Health research project over the summer of 2019. The efforts, technology and most importantly the people are astounding. The event was standing room only, and there were several amazing conversations about the various NASA FDL projects.

Machine Learning For Space

I do hope you will continue to visit my site. If you continue to visit you will notice I have a type of “disease” called Biblomaniac-ism. As such I bought a book at the conference:

The future is distributed

So there you have it. While this probably was tl;dr I hope you gave it a good scan while you were doing a pull request or two. I hope this has at least provided some insight into the conference.

\(\forall\) papers: https://papers.nips.cc/book/advances-in-neural-information-processing-systems-32-2019

Until Then,

#IWishYouWater

tctjr

Book Review: Future Shock

Future Shock Book Spine

One of the things, Oh Dear Reader, you will come to find out about me is that I have a disease called biblomaniacism. I argue, however, that as with most things, indulgence, not compulsion, is the order of the day. However, I also argue that if you are going to have a vice, or let us say an issue as it were, then obsessive reading or collecting of books is not such a bad thing to have, unless they fall on you or you have to move them. I wanted to give the reader a full context for future meanderings in the realm of book reviews and general book discussions.

As of late, I have been having discussions on several fronts concerning the sharing economy and how transience and complexity add to the perception of less time in our lives. Humans also ask me to recommend books. Given these discussions, I have been recommending a book entitled “Future Shock” by Alvin Toffler. Here are the particulars:

Book Title: Future Shock
Author: Alvin Toffler
Publisher: Random House
ISBN: 0-394-42586-3 (Original hardcover)
Copyright: 1970

I have the original hardback version. I love the black cloth cover with the red embossed lettering. I also love the perforated edges on the pages. The dedication page is classic:

Dedication Page

The book’s premise presupposes that we as humans are moving into an era of “information overload”. As far as I know this is the first mention of the terminology. Once again, this book was published in 1970. The book argues that we as a society are facing enormous structural change, a revolution from an industrial society to a “super-industrial society”. As such, the underlying delta in our perceptual makeup from moving from “atoms to bits” is that our sense of ownership and therefore our sense of time is greatly affected. The sense of ownership is affected by moving from having and owning to renting and sharing. The tome goes into great detail on how we are ever more transient in our behavior, much in the same way our ancestors were nomadic. However, the major differentiation is that the cultural break from the past now comes at a price.

An excerpt from page 11:

“Future Shock is a time phenomenon a product of the greatly accelerated rate of change in our society. It arises from the superposition of a new culture on an old one. It is a culture shock in one’s own society. But its impact is far worse. For most Peace Corpsmen, in fact, most travelers, have the comforting knowledge that the culture they left behind will be there to return to. The victim of the future shock does not.”

The underlying thesis is that we as a society are processing more information in a shorter amount of time, which results in all aspects of our being and relationships with life becoming compressed and transient. For example, take an individual out of his/her own culture and set them down in an environment where there are different rules, both written and unwritten, which apply to conceptions of time, sex, religion, work, and personal space, and cut them off from any hope of retreating back to a more familiar social landscape. This can be exacerbated if the culture has different value systems, which it probably does. What then is considered rational behavior under these circumstances for the individual? The book takes this view and applies it to entire societies and generations. Thus this incurs future shock on a massive scale.

One very cogent aspect that resonated with me is the concept of fetishizing anything and everything. The execution of this fetishization comes through the application of sub-cultures, where any little modification results in a new genre of the individual with respect to the sub-culture. Maybe one reason this resonated with me was his illustration of surfers being a sub-culture. Toffler does an amazing job of mapping this sub-culture fetish to having styles automatically chosen for us, whereby we adopt the lifestyle without having to really perform the machinations associated with, say, paddling out in an ocean. If you adopt the style, the perception that you are part of the culture is enough, due to the transient nature of changing sub-cultures.

Toffler also goes into depth on the need for our educational system, especially K-12, to address thinking in the future instead of rank and file history, which he mentions is in most cases variational and filtered as a function of the teacher’s belief system. He proposes a complete overhaul of the educational system, from the static teaching agenda we now have, based on 17th-century rote memorization skills, to a more adaptive system of learning. He also emphasizes how education will be more of a distributed, individualized, auto-didactic process. I consider myself to be an autodidact and relish the ability to sign up for Udemy or Coursera classes, ergo I completely believe he nailed this assumption for the future classroom.

Oh, dear reader, if you made it this far fear not, this book is not a nihilistic or dystopian view of that which will inevitably come to pass. Toffler has a litany of suggestions for how we can overcome the future shock malaise, which he suggests could in fact be a new medical condition. I, however, will not list these in a cookbook fashion as I do not want to be a spoiler. Suffice to say we are seeing some people exercise their future thought to change future shock.

Caveat Emptor: This book will stretch and at the same time bind what you thought was good or bad for our western society. While you will probably pay a premium for the hardback original edition, the paperback edition can easily be purchased for a very reasonable price. For those who work in the areas of dealing with humans or creating new technology, I highly recommend adding this to your reading list. Your neurons will thank you for it.

If you happen to have read the book or are reading the book I would appreciate any comments you care to share.

Blogging Music: “Entre Dos Aguas” by Paco de Lucía, 1981.

Until then,

I wish you water,

tctjr.

Under Re-Construction!

Hey Y’all!

I decided to finally stand up a full site. Hosting on AWS and all that stuff. I am going to start writing on several different subjects.

Until Then Remember!

$$H(X) = -\sum_{x} p(x)\log p(x)\\ \text{Information Gain:}\; I(X;Y) = H(X) - H(X|Y)$$
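For anyone who wants to play with the two formulas above, here is a minimal sketch that estimates them from observed samples in plain Python. The function names are my own, the probabilities are simple empirical frequencies, and I am assuming a base-2 logarithm so the answers come out in bits (the formula itself does not fix the base).

```python
import math
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy H(X) in bits."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy(xs, ys):
    """Empirical H(X|Y): entropy of X within each group of equal Y, weighted."""
    n = len(xs)
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def information_gain(xs, ys):
    """I(X;Y) = H(X) - H(X|Y)."""
    return entropy(xs) - conditional_entropy(xs, ys)


xs = [0, 0, 1, 1]
print(entropy(xs))                         # fair coin
print(information_gain(xs, [0, 0, 1, 1]))  # Y determines X
print(information_gain(xs, [0, 1, 0, 1]))  # Y tells us nothing about X
```

When Y fully determines X the gain equals H(X), and when Y is unrelated to X the gain drops to zero.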