{"id":2336,"date":"2018-07-12T13:57:22","date_gmt":"2018-07-12T12:57:22","guid":{"rendered":"https:\/\/www.quantum-bits.org\/?p=2336"},"modified":"2022-08-12T17:15:25","modified_gmt":"2022-08-12T16:15:25","slug":"on-quantum-computing-and-artificial-intelligence","status":"publish","type":"post","link":"https:\/\/www.quantum-bits.org\/?p=2336","title":{"rendered":"On Quantum Computing and Artificial Intelligence"},"content":{"rendered":"<div>\n<p>Mixing quantum computing and Artificial Intelligence (AI) may sound like a new buzzword. However, since quantum computing advances are hinting at profound changes in the very notions of computation, it is natural to reexamine various branches of computer science in the light of these disruptions. And &#8230; AI is no exception<\/p>\n<\/div>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-1612\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/03\/quantum-physics-formulas-over-blackboard.jpg\" alt=\"\" width=\"850\" height=\"332\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/03\/quantum-physics-formulas-over-blackboard.jpg 768w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/03\/quantum-physics-formulas-over-blackboard-300x117.jpg 300w\" sizes=\"(max-width: 850px) 100vw, 850px\" \/><\/p>\n<p>As usual, before entering the <em>quantum realm<\/em>, it is important to get an overview of the <em>classical world<\/em>.<\/p>\n<p><strong>AI, Machine Learning, Artificial Neural Networks &amp; Deep Learning<br \/>\n<\/strong><\/p>\n<p><strong>Artificial Intelligence<\/strong> is difficult to define. Probably because intelligence, by itself, is difficult to define. AI is often defined as the study of any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals. 
In day-to-day situations, the term&nbsp;AI applies when a machine performs tasks that normally require human intelligence, such as learning, reasoning, problem solving, decision-making, planning, visual perception, knowledge representation, natural language processing, competing at the highest level in&nbsp;strategic game&nbsp;systems, or driving cars.<\/p>\n<p>The AI field draws upon&nbsp;computer science,&nbsp;mathematics,&nbsp;psychology,&nbsp;linguistics and many other fields. Approaches to AI include&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_intelligence#Statistical\" target=\"_blank\" rel=\"noopener\">statistical methods<\/a>,&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_intelligence#Sub-symbolic\" target=\"_blank\" rel=\"noopener\">computational intelligence<\/a>, and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_intelligence#Symbolic\" target=\"_blank\" rel=\"noopener\">traditional symbolic AI<\/a>. To reach its goals, AI relies on many tools, including <a href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_intelligence#Search_and_optimization\" target=\"_blank\" rel=\"noopener\">search and mathematical optimization<\/a>,&nbsp;<a title=\"Artificial neural network\" href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_neural_network\" target=\"_blank\" rel=\"noopener\">artificial neural networks<\/a>, and&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_intelligence#Probabilistic_methods_for_uncertain_reasoning\" target=\"_blank\" rel=\"noopener\">methods based on statistics, probability and economics<\/a>.&nbsp;<\/p>\n<p>Over its history, AI has seen many ups and downs. As of 2018, AI techniques have experienced a resurgence following concurrent advances in&nbsp;computer power, large amounts of&nbsp;data, and theoretical understanding. 
<strong>Nowadays, AI techniques have become an essential part of industry<\/strong>.<\/p>\n<p><strong class=\"markup--strong markup--p-strong\">Machine learning<\/strong> (ML) was first defined by Arthur Samuel in 1959. It is a subset of artificial intelligence that uses statistical techniques to give computers the ability to <strong>learn<\/strong> (i.e., progressively improve performance on a specific task)<strong> with data<\/strong>, <strong>without being explicitly programmed<\/strong>. Machine learning algorithms operate by constructing a model with parameters that can be learned from a large set of example inputs, called a training set.<\/p>\n<p>Machine learning is typically classified into three broad categories (leading to different typical applications):<\/p>\n<ul>\n<li><a title=\"Supervised learning\" href=\"https:\/\/en.wikipedia.org\/wiki\/Supervised_learning\" target=\"_blank\" rel=\"noopener\">Supervised learning<\/a>: the computer is presented with example inputs and their desired outputs, given by a &#8220;teacher&#8221;, and the goal is to learn a general rule that maps inputs to outputs. As special cases, the input signal can be only partially available, or restricted to special feedback;<\/li>\n<li><a title=\"\" href=\"https:\/\/en.wikipedia.org\/wiki\/Unsupervised_learning\" target=\"_blank\" rel=\"noopener\">Unsupervised learning<\/a>: no labels are given to the learning algorithm, leaving it on its own to find structure in its input. 
Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (<a title=\"Feature learning\" href=\"https:\/\/en.wikipedia.org\/wiki\/Feature_learning\" target=\"_blank\" rel=\"noopener\">feature learning<\/a>);<\/li>\n<li><a href=\"https:\/\/en.wikipedia.org\/wiki\/Reinforcement_learning\" target=\"_blank\" rel=\"noopener\">Reinforcement learning<\/a>: inspired by behaviourist psychology, reinforcement learning is concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.<\/li>\n<\/ul>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2356\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/machine-learning.png\" alt=\"\" width=\"650\" height=\"397\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/machine-learning.png 1691w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/machine-learning-300x183.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/machine-learning-768x469.png 768w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/machine-learning-1024x625.png 1024w\" sizes=\"(max-width: 650px) 100vw, 650px\" \/><\/p>\n<p>There are so many approaches to <strong>Machine Learning<\/strong> that it is difficult to get an exhaustive overview of ML algorithms. 
This selected list summarizes the most common approaches:<\/p>\n<ul>\n<li><a title=\"Decision tree learning\" href=\"https:\/\/en.wikipedia.org\/wiki\/Decision_tree_learning\" target=\"_blank\" rel=\"noopener\">Decision tree learning<\/a> uses a <a title=\"Decision tree\" href=\"https:\/\/en.wikipedia.org\/wiki\/Decision_tree\" target=\"_blank\" rel=\"noopener\">decision tree<\/a> as a <a title=\"Predictive modelling\" href=\"https:\/\/en.wikipedia.org\/wiki\/Predictive_modelling\" target=\"_blank\" rel=\"noopener\">predictive model<\/a>, which maps observations about an item to conclusions about the item&#8217;s target value;<\/li>\n<li><a class=\"mw-redirect\" title=\"Support vector machines\" href=\"https:\/\/en.wikipedia.org\/wiki\/Support_vector_machines\" target=\"_blank\" rel=\"noopener\">Support vector machines<\/a> (SVM) are a set of related supervised learning methods used for <a title=\"Statistical classification\" href=\"https:\/\/en.wikipedia.org\/wiki\/Statistical_classification\" target=\"_blank\" rel=\"noopener\">classification<\/a> and <a title=\"Regression analysis\" href=\"https:\/\/en.wikipedia.org\/wiki\/Regression_analysis\" target=\"_blank\" rel=\"noopener\">regression<\/a>. 
Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other;<\/li>\n<li><a title=\"\" href=\"https:\/\/en.wikipedia.org\/wiki\/Genetic_algorithm\" target=\"_blank\" rel=\"noopener\">Genetic algorithm<\/a> (GA) is a search heuristic that mimics the process of <a title=\"Natural selection\" href=\"https:\/\/en.wikipedia.org\/wiki\/Natural_selection\" target=\"_blank\" rel=\"noopener\">natural selection<\/a>, and uses methods such as <a title=\"Mutation (genetic algorithm)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Mutation_(genetic_algorithm)\" target=\"_blank\" rel=\"noopener\">mutation<\/a> and <a title=\"Crossover (genetic algorithm)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Crossover_(genetic_algorithm)\" target=\"_blank\" rel=\"noopener\">crossover<\/a> to generate new <a title=\"Chromosome (genetic algorithm)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Chromosome_(genetic_algorithm)\" target=\"_blank\" rel=\"noopener\">genotypes<\/a> in the hope of finding good solutions to a given problem;<\/li>\n<li><a title=\"\" href=\"https:\/\/en.wikipedia.org\/wiki\/Cluster_analysis\" target=\"_blank\" rel=\"noopener\">Cluster analysis<\/a> is the assignment of a set of observations into subsets (clusters) so that observations within the same cluster are similar according to some predesignated criterion or criteria, while observations drawn from different clusters are dissimilar;<\/li>\n<li><a title=\"\" href=\"https:\/\/en.wikipedia.org\/wiki\/Bayesian_network\" target=\"_blank\" rel=\"noopener\">Bayesian networks<\/a> (or belief networks) are <a title=\"\" href=\"https:\/\/en.wikipedia.org\/wiki\/Graphical_model\" target=\"_blank\" rel=\"noopener\">probabilistic graphical models<\/a> that represent a set of random variables and their conditional independencies via <a title=\"Directed acyclic graph\" 
href=\"https:\/\/en.wikipedia.org\/wiki\/Directed_acyclic_graph\" target=\"_blank\" rel=\"noopener\">directed acyclic graphs<\/a>. They are used for&nbsp;<a title=\"Inference\" href=\"https:\/\/en.wikipedia.org\/wiki\/Inference\">inference<\/a> and learning;<\/li>\n<li><a title=\"Artificial neural network\" href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_neural_network\" target=\"_blank\" rel=\"noopener\">Artificial neural network<\/a> are learning algorithm inspired by <a class=\"mw-redirect\" title=\"Biological neural networks\" href=\"https:\/\/en.wikipedia.org\/wiki\/Biological_neural_networks\" target=\"_blank\" rel=\"noopener\">biological neural networks<\/a>. Computations are structured in terms of an interconnected group of <a title=\"Artificial neuron\" href=\"https:\/\/en.wikipedia.org\/wiki\/Artificial_neuron\" target=\"_blank\" rel=\"noopener\">artificial neurons<\/a>, processing information. Modern neural networks are <a class=\"mw-redirect\" title=\"Non-linear\" href=\"https:\/\/en.wikipedia.org\/wiki\/Non-linear\" target=\"_blank\" rel=\"noopener\">non-linear<\/a> <a class=\"mw-redirect\" title=\"Statistical\" href=\"https:\/\/en.wikipedia.org\/wiki\/Statistical\">statistical<\/a> <a title=\"Data modeling\" href=\"https:\/\/en.wikipedia.org\/wiki\/Data_modeling\">data modeling<\/a> tools. They are usually used to model complex relationships between inputs and outputs, to <a title=\"Pattern recognition\" href=\"https:\/\/en.wikipedia.org\/wiki\/Pattern_recognition\">find patterns<\/a> in data, or to capture the statistical structure in an unknown <a title=\"Joint probability distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Joint_probability_distribution\" target=\"_blank\" rel=\"noopener\">joint probability distribution<\/a> between observed variables.<\/li>\n<\/ul>\n<p>Let us focus on <strong>Artificial Neural Networks<\/strong>. 
The first formal artificial neuron model was proposed by&nbsp;<a class=\"mw-redirect\" title=\"Warren McCulloch\" href=\"https:\/\/en.wikipedia.org\/wiki\/Warren_McCulloch\" target=\"_blank\" rel=\"noopener\">Warren McCulloch<\/a>&nbsp;and&nbsp;<a title=\"Walter Pitts\" href=\"https:\/\/en.wikipedia.org\/wiki\/Walter_Pitts\" target=\"_blank\" rel=\"noopener\">Walter Pitts<\/a>&nbsp;in 1943.&nbsp;<\/p>\n<p>For a given artificial neuron <img src='https:\/\/s0.wp.com\/latex.php?latex=k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k' title='k' class='latex' \/>, let there be <img src='https:\/\/s0.wp.com\/latex.php?latex=m&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='m' title='m' class='latex' \/> input signals <img src='https:\/\/s0.wp.com\/latex.php?latex=x_1%2C+x_2%2C+%5Ccdots%2C+x_m&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='x_1, x_2, \cdots, x_m' title='x_1, x_2, \cdots, x_m' class='latex' \/>. In this model, each input <img src='https:\/\/s0.wp.com\/latex.php?latex=x_j&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='x_j' title='x_j' class='latex' \/> is associated with a synaptic weight <img src='https:\/\/s0.wp.com\/latex.php?latex=w_j&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='w_j' title='w_j' class='latex' \/>. The output <img src='https:\/\/s0.wp.com\/latex.php?latex=y_k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='y_k' title='y_k' class='latex' \/> is given by a transfer function (also called activation function) applied to the weighted sum of the input signals.<\/p>\n<p>In the <strong>McCulloch-Pitts<\/strong> model, a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Heaviside_step_function\" target=\"_blank\" rel=\"noopener\">Heaviside step function<\/a> was used as a transfer function. 
Nowadays, a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Sigmoid_function\" target=\"_blank\" rel=\"noopener\">sigmoid<\/a> function <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvarphi%28x%29+%5Cin+%5D-1%2C%2B1%5B&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\varphi(x) \in ]-1,+1[' title='\varphi(x) \in ]-1,+1[' class='latex' \/> is often used instead, in order to introduce nonlinearity into the algorithm:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2413\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuron-transfer-function.png\" alt=\"\" width=\"500\" height=\"473\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuron-transfer-function.png 1292w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuron-transfer-function-300x284.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuron-transfer-function-768x726.png 768w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuron-transfer-function-1024x968.png 1024w\" sizes=\"(max-width: 500px) 100vw, 500px\" \/><\/p>\n<p>To combine these neurons together, one uses the output of a neuron as one of the inputs of another neuron. Combined together, neurons can act as any <a href=\"https:\/\/en.wikipedia.org\/wiki\/Logic_gate\" target=\"_blank\" rel=\"noopener\">logic gate<\/a>. 
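To make the neuron model concrete, here is a minimal Python sketch (an illustration added here, not code from the original post; the weights and bias are hypothetical values): the weighted sum of the inputs is passed through a tanh-shaped sigmoid taking values in ]-1,+1[, and thresholding the output at 0 makes this particular neuron behave like an OR gate.

```python
import math

def neuron(inputs, weights, bias=0.0):
    """Artificial neuron: weighted sum of the inputs passed through
    a sigmoid transfer function with values in ]-1, +1[ (tanh)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(z)

# Hypothetical weights and bias: thresholding the output at 0
# makes this neuron act like an OR gate on binary inputs.
w, b = [2.0, 2.0], -1.0
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, neuron(x, w, b) > 0)  # fires iff at least one input is 1
```

The same construction with other weight choices yields AND or NOT behaviour, which is what allows combined neurons to emulate logic gates.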
Furthermore, if one introduces loops in the neural network, it can be used as a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Turing_machine\" target=\"_blank\" rel=\"noopener\">Turing machine<\/a>.<\/p>\n<p>In the case where <img src='https:\/\/s0.wp.com\/latex.php?latex=w_%7Bij%7D+%3D+w_%7Bji%7D%2C+%5Cforall+i%2Cj&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='w_{ij} = w_{ji}, \\forall i,j' title='w_{ij} = w_{ji}, \\forall i,j' class='latex' \/> (connections are symmetric) and <img src='https:\/\/s0.wp.com\/latex.php?latex=w_%7Bii%7D%3D0%2C+%5Cforall+i&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='w_{ii}=0, \\forall i' title='w_{ii}=0, \\forall i' class='latex' \/> (no unit has a connection with itself), one obtains a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hopfield_network\" target=\"_blank\" rel=\"noopener\">Hopfield neural network<\/a> (which accounts for <a title=\"Association (psychology)\" href=\"https:\/\/en.wikipedia.org\/wiki\/Association_(psychology)\">associative<\/a> <a title=\"Memory\" href=\"https:\/\/en.wikipedia.org\/wiki\/Memory\" target=\"_blank\" rel=\"noopener\">memory<\/a>):<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2411\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/hnn.png\" alt=\"\" width=\"300\" height=\"372\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/hnn.png 707w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/hnn-242x300.png 242w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p>\n<p>Conversely, a <b>feedforward neural network<\/b>&nbsp;is an&nbsp;artificial neural network&nbsp;wherein connections between the nodes do&nbsp;not&nbsp;form a cycle:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2416\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/fnn.png\" alt=\"\" width=\"250\" height=\"229\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/fnn.png 538w, 
https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/fnn-300x274.png 300w\" sizes=\"(max-width: 250px) 100vw, 250px\" \/><\/p>\n<p><strong>Deep learning<\/strong> (DL) is a subset of machine learning. It utilizes a hierarchical level of artificial neural networks to carry out the process of machine learning. While traditional programs build analysis with data in a linear way, the hierarchical function of deep learning systems enables machines to process data with a nonlinear approach.<\/p>\n<p>DL is a specific approach used for building and training neural networks. An algorithm is considered to be deep if the input data is passed through a series of nonlinear transformations before it becomes output.<\/p>\n<p>DL removes the manual identification of features in data and, instead, relies on whatever training process it has in order to discover the useful patterns in the input examples. This makes training the neural network easier and faster, and it can yield a better result.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2420\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/deeplearningnnetwork.png\" alt=\"\" width=\"700\" height=\"306\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/deeplearningnnetwork.png 1643w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/deeplearningnnetwork-300x131.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/deeplearningnnetwork-768x336.png 768w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/deeplearningnnetwork-1024x448.png 1024w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/p>\n<p>Be it <em>classical<\/em> or <em>quantum<\/em>, the main job of artificial neural networks is to recognize patterns. 
To do so, when data is fed into the network, each artificial neuron that fires transmits a signal to certain neurons in the next layer, which are likely to fire if multiple signals are received.<\/p>\n<p>The first layer of the neural network processes a raw data input and passes it on to the next layer as output. A second layer processes the previous layer\u2019s information by including additional information and passes on its result. The next layer takes the second layer\u2019s information and, in turn, includes other raw data to make the machine\u2019s pattern even better. This continues across all levels of the neural network:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2408\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuralnetdog.png\" alt=\"\" width=\"400\" height=\"240\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuralnetdog.png 977w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuralnetdog-300x180.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/neuralnetdog-768x461.png 768w\" sizes=\"(max-width: 400px) 100vw, 400px\" \/><\/p>\n<p>The process filters out noise and retains only the most relevant features. The initial layer accepts input (such as the pixels of an image), intermediate layers create various combinations of the input (representing structures such as edges and geometric shapes) and a final layer produces output (a high-level description of the image content). The &#8220;wiring&#8221; is not fixed in advance, but adapts through a process of trial and error.<\/p>\n<p>On a <em>classical<\/em> <em>computer<\/em>, all these interconnections are represented by an enormous matrix of numbers. Running the network means doing matrix algebra. 
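A sketch of that matrix algebra with NumPy (an added illustration, not from the original post; the layer sizes and random weights are hypothetical): running the network layer by layer is just repeated matrix-vector products followed by a nonlinearity.

```python
import numpy as np

rng = np.random.default_rng(0)

def forward(x, layers):
    """Feedforward pass: each layer is a (weight matrix, bias vector)
    pair, so inference is repeated matrix algebra plus a nonlinearity."""
    for W, b in layers:
        x = np.tanh(W @ x + b)
    return x

# Hypothetical architecture: 4 inputs -> 8 hidden -> 8 hidden -> 2 outputs.
sizes = [4, 8, 8, 2]
layers = [(rng.standard_normal((n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes, sizes[1:])]

y = forward(rng.standard_normal(4), layers)
print(y.shape)  # (2,)
```

Every step is a dense matrix product, which is exactly the workload GPUs (and, potentially, quantum processors) are good at.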
Typically, these heavy matrix operations are outsourced to specialized chips such as GPUs (Graphics Processing Units).&nbsp;<\/p>\n<p>And this is where <em>quantum computing<\/em> can enter.<\/p>\n<p><strong>Amplitude encoding<\/strong><\/p>\n<p>Let&#8217;s explore the impact of quantum computing on AI with <strong>amplitude encoding<\/strong>.<\/p>\n<p>Generally speaking, a two-qubit system has four joint states:<\/p>\n<ul>\n<li><img src='https:\/\/s0.wp.com\/latex.php?latex=%7C00%5Crangle+%3D+%7C0%5Crangle+%5Cotimes+%7C0%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|00\rangle = |0\rangle \otimes |0\rangle' title='|00\rangle = |0\rangle \otimes |0\rangle' class='latex' \/>;<\/li>\n<li><img src='https:\/\/s0.wp.com\/latex.php?latex=%7C01%5Crangle+%3D+%7C0%5Crangle+%5Cotimes+%7C1%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|01\rangle = |0\rangle \otimes |1\rangle' title='|01\rangle = |0\rangle \otimes |1\rangle' class='latex' \/>;<\/li>\n<li><img src='https:\/\/s0.wp.com\/latex.php?latex=%7C10%5Crangle+%3D+%7C1%5Crangle+%5Cotimes+%7C0%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|10\rangle = |1\rangle \otimes |0\rangle' title='|10\rangle = |1\rangle \otimes |0\rangle' class='latex' \/>;<\/li>\n<li><img src='https:\/\/s0.wp.com\/latex.php?latex=%7C11%5Crangle+%3D+%7C1%5Crangle+%5Cotimes+%7C1%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|11\rangle = |1\rangle \otimes |1\rangle' title='|11\rangle = |1\rangle \otimes |1\rangle' class='latex' \/>.<\/li>\n<\/ul>\n<p>Let&#8217;s associate the <a title=\"Probability amplitude\" href=\"https:\/\/en.wikipedia.org\/wiki\/Probability_amplitude\">probability amplitudes<\/a> of a quantum state with the inputs and outputs of computations.&nbsp; Each state has a certain amplitude (or weight) that can represent a neuron.<\/p>\n<p>If you add a third qubit, you can represent eight neurons; a fourth, 16. 
Since a state of <img src='https:\/\/s0.wp.com\/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' \/> qubits is described by&nbsp;<img src='https:\/\/s0.wp.com\/latex.php?latex=2%5En&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2^n' title='2^n' class='latex' \/> complex amplitudes, <strong>this information encoding can allow for an exponentially compact representation<\/strong>.<\/p>\n<p>It is estimated that 60 qubits would be enough to encode an amount of data equivalent to that produced by humanity in a year.<\/p>\n<p>When you act on a state of four qubits, you are processing 16 numbers at a stroke, whereas a classical computer would have to go through those numbers one by one. The goal of algorithms based on amplitude encoding is to formulate <strong>quantum algorithms<\/strong> whose resources grow <strong>polynomially<\/strong> in the number of qubits, which amounts to a<strong> logarithmic growth in the number of amplitudes<\/strong> and thereby the dimension of the input.<\/p>\n<p><strong>Quantum algorithm for linear systems (HHL)<\/strong><span style=\"color: #999999;\"><strong><br \/>\n<\/strong><\/span><\/p>\n<p>Nevertheless, quantum computers\u2019 ability to store information compactly doesn\u2019t by itself make computation faster. 
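The bookkeeping behind amplitude encoding can be sketched classically (an added illustration, not from the original post): a data vector is padded to a power-of-two length and normalized, so that 2^n numbers fit into the amplitudes of just n qubits.

```python
import numpy as np

def amplitude_encode(data):
    """Encode a classical vector into the amplitudes of an n-qubit
    state: pad to the next power of two, then normalize so the
    squared amplitudes sum to 1 (a valid quantum state)."""
    data = np.asarray(data, dtype=float)
    n_qubits = int(np.ceil(np.log2(len(data))))
    padded = np.zeros(2 ** n_qubits)
    padded[: len(data)] = data
    state = padded / np.linalg.norm(padded)
    return n_qubits, state

# 16 numbers fit into the amplitudes of just 4 qubits.
n, psi = amplitude_encode(range(1, 17))
print(n, len(psi))  # 4 16
```

This only simulates the encoding; the exponential compactness pays off on actual quantum hardware, where the 2^n amplitudes are not stored explicitly.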
In 2009, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Seth_Lloyd\" target=\"_blank\" rel=\"noopener\">Seth Lloyd<\/a> and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Aram_Harrow\" target=\"_blank\" rel=\"noopener\">Aram Harrow<\/a> from MIT and&nbsp;<a href=\"http:\/\/u.cs.biu.ac.il\/~avinatan\/\" target=\"_blank\" rel=\"noopener\">Avinatan Hassidim<\/a> from Bar-Ilan University in Israel designed a crucial quantum algorithm (now known as the <strong>HHL<\/strong> algorithm) for solving <a title=\"\" href=\"https:\/\/en.wikipedia.org\/wiki\/System_of_linear_equations\" target=\"_blank\" rel=\"noopener\">linear systems<\/a> <img src='https:\/\/s0.wp.com\/latex.php?latex=A%7Cx%5Crangle%3D%7Cb%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='A|x\rangle=|b\rangle' title='A|x\rangle=|b\rangle' class='latex' \/>.<\/p>\n<p>Exploring the details of HHL is beyond the scope of this post, but you can have a look at their seminal paper &#8220;<a href=\"https:\/\/arxiv.org\/abs\/0811.3171\" target=\"_blank\" rel=\"noopener\"><em>Quantum algorithm for solving linear systems of equations<\/em><\/a>&#8221;.<\/p>\n<p>To be precise, HHL is not <em>exactly<\/em> an algorithm for solving a system of linear equations in logarithmic time. 
Rather, it\u2019s an algorithm for preparing a quantum superposition of the form <img src='https:\/\/s0.wp.com\/latex.php?latex=%7Cx%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|x\rangle' title='|x\rangle' class='latex' \/>, where <img src='https:\/\/s0.wp.com\/latex.php?latex=%7Cx%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|x\rangle' title='|x\rangle' class='latex' \/> is the solution to the linear system <img src='https:\/\/s0.wp.com\/latex.php?latex=A%7Cx%5Crangle%3D%7Cb%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='A|x\rangle=|b\rangle' title='A|x\rangle=|b\rangle' class='latex' \/> (<img src='https:\/\/s0.wp.com\/latex.php?latex=A&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='A' title='A' class='latex' \/> being Hermitian).<\/p>\n<p>Even though <strong>HHL has known caveats<\/strong>, it is one of the most important subroutines in many quantum machine learning algorithms.<\/p>\n<p>For example, the quantum algorithm for linear systems of equations has been applied to a <strong>support vector machine<\/strong>. Rebentrost <em>et al<\/em>. 
have shown that a quantum support vector machine can be used for big data classification and achieve an exponential speedup over classical computers.<\/p>\n<p><strong>Amplitude amplification<\/strong><span style=\"color: #999999;\"><strong><br \/>\n<\/strong><\/span><\/p>\n<p>The HHL algorithm is one of the main fundamental algorithms expected to provide a speedup over their classical counterparts, along with <a title=\"Grover's algorithm\" href=\"https:\/\/en.wikipedia.org\/wiki\/Grover%27s_algorithm\" target=\"_blank\" rel=\"noopener\">Grover&#8217;s search algorithm<\/a> or <a class=\"mw-redirect\" title=\"Shor's Algorithm\" href=\"https:\/\/en.wikipedia.org\/wiki\/Shor%27s_Algorithm\" target=\"_blank\" rel=\"noopener\">Shor&#8217;s factoring algorithm<\/a>:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-1971\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-circuit-shor.png\" alt=\"\" width=\"600\" height=\"329\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-circuit-shor.png 1437w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-circuit-shor-300x164.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-circuit-shor-768x421.png 768w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-circuit-shor-1024x561.png 1024w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/p>\n<p>Another approach to improving classical machine learning with quantum information processing uses <a title=\"Amplitude amplification\" href=\"https:\/\/en.wikipedia.org\/wiki\/Amplitude_amplification\" target=\"_blank\" rel=\"noopener\">amplitude amplification<\/a> methods based on Grover&#8217;s search algorithm:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2037\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-grover.png\" alt=\"\" width=\"550\" height=\"134\" 
srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-grover.png 1374w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-grover-300x73.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-grover-768x187.png 768w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/04\/qec-grover-1024x250.png 1024w\" sizes=\"(max-width: 550px) 100vw, 550px\" \/><br \/>\nMany AI problems can be reduced to searching:&nbsp; planning, scheduling, theorem proving, etc.&nbsp; Indeed, it has been shown that amplitude amplification can solve unstructured search problems with a quadratic speedup compared to classical algorithms.<\/p>\n<p>These quantum routines can be employed for learning algorithms that translate into an unstructured search task, as can be done for instance in the case of the <a title=\"K-medians clustering\" href=\"https:\/\/en.wikipedia.org\/wiki\/K-medians_clustering\" target=\"_blank\" rel=\"noopener\">k-medians<\/a> and the <a class=\"mw-redirect\" title=\"K-nearest neighbour\" href=\"https:\/\/en.wikipedia.org\/wiki\/K-nearest_neighbour\" target=\"_blank\" rel=\"noopener\">k-nearest neighbors algorithms<\/a>. Amplitude amplification is often combined with quantum walks to achieve the same quadratic speedup. Quantum walks have been proposed to enhance Google&#8217;s PageRank algorithm for example.<\/p>\n<p><strong>Quantum Neural Networks<\/strong><\/p>\n<p>Computer scientists are trying to combine artificial neural network models with the advantages of quantum information in order to develop more efficient algorithms. The hope is that features of quantum computing such as quantum parallelism or the effects of interference and entanglement can be used as resources. One important motivation for these investigations is the difficulty to train classical neural networks, especially in big data applications.<\/p>\n<p><strong>Quantum neural network<\/strong> research is still in its infancy. 
Most of the proposed models are based on the idea of replacing classical McCulloch-Pitts neurons with a <a title=\"Qubit\" href=\"https:\/\/en.wikipedia.org\/wiki\/Qubit\">qubit<\/a> (sometimes called a &#8220;quron\u201d), resulting in neural units that can be in a superposition of the state \u2018firing\u2019 and \u2018resting\u2019:<\/p>\n<ul>\n<li>The McCulloch-Pitts neuron <img src='https:\/\/s0.wp.com\/latex.php?latex=x_n+%5Cin+%5D-1%2C%2B1%5B&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='x_n \\in ]-1,+1[' title='x_n \\in ]-1,+1[' class='latex' \/> is replaced by a qubit (quron) <img src='https:\/\/s0.wp.com\/latex.php?latex=%7Cx_n%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|x_n\\rangle' title='|x_n\\rangle' class='latex' \/> in 2-dimensional Hilbert space <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cmathcal%7BH%7D%5E2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\mathcal{H}^2' title='\\mathcal{H}^2' class='latex' \/> with basis <img src='https:\/\/s0.wp.com\/latex.php?latex=%5C%7B%7C0%5Crangle%2C%7C1%5Crangle%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\{|0\\rangle,|1\\rangle\\}' title='\\{|0\\rangle,|1\\rangle\\}' class='latex' \/>;<\/li>\n<li>The state <img src='https:\/\/s0.wp.com\/latex.php?latex=%7C%5Cpsi%5Crangle&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='|\\psi\\rangle' title='|\\psi\\rangle' class='latex' \/> of a network of <img src='https:\/\/s0.wp.com\/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' \/> qurons becomes then a quantum state in a <img src='https:\/\/s0.wp.com\/latex.php?latex=2%5En&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2^n' title='2^n' class='latex' \/>-dimensional Hilbert space <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cmathcal%7BH%7D%5E%7B2n%7D+%3D+%5Cmathcal%7BH%7D+%5Cotimes+%5Ccdots+%5Cotimes+%5Cmathcal%7BH%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\mathcal{H}^{2n} = \\mathcal{H} \\otimes \\cdots \\otimes \\mathcal{H}' 
title='\\mathcal{H}^{2n} = \\mathcal{H} \\otimes \\cdots \\otimes \\mathcal{H}' class='latex' \/> with basis <img src='https:\/\/s0.wp.com\/latex.php?latex=%5C%7B+%7C00+%5Ccdots+0+%5Crangle%2C+%5Ccdots%2C+%7C11+%5Ccdots+1+%5Crangle+%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\{ |00 \\cdots 0 \\rangle, \\cdots, |11 \\cdots 1 \\rangle \\}' title='\\{ |00 \\cdots 0 \\rangle, \\cdots, |11 \\cdots 1 \\rangle \\}' class='latex' \/>.<\/li>\n<\/ul>\n<div>Apart from the quron, proposals for Quantum Neural Network (QNN) models vary strongly in their proximity to the idea of neural networks. In order to assess their scope, one can state three minimum requirements for a meaningful QNN, taking as reference the Hopfield neural network model and its feature of associative memory.<\/div>\n<div>&nbsp;<\/div>\n<div>These requirements can be summarized as:<\/div>\n<div>&nbsp;<\/div>\n<ol>\n<li>The initial state of the quantum system encodes any binary string of length <em>n<\/em>.&nbsp;<\/li>\n<li>The QNN reflects one or more basic neural computing mechanisms (attractor dynamics, synaptic connections, integrate &amp; fire, training rules, &#8230;)<\/li>\n<li>The evolution is based on quantum effects, such as superposition, entanglement and interference, and it is fully consistent with quantum theory.<\/li>\n<\/ol>\n<p>Let&#8217;s have a look at how a QNN could be applied to a feedforward neural network:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2437\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/sfnn.png\" alt=\"\" width=\"250\" height=\"103\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/sfnn.png 528w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/sfnn-300x123.png 300w\" sizes=\"(max-width: 250px) 100vw, 250px\" \/><\/p>\n<p>Two problems arise with regard to quantum computations:<\/p>\n<ul>\n<li>The quantum operations need to be reversible and unitary. 
The classical transfer function <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvarphi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\varphi' title='\\varphi' class='latex' \/> is not reversible as it squeezes the two-dimensional input vector to a single number <img src='https:\/\/s0.wp.com\/latex.php?latex=y&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='y' title='y' class='latex' \/>;<\/li>\n<li>Due to the <a title=\"\" href=\"https:\/\/en.wikipedia.org\/wiki\/No-cloning_theorem\" target=\"_blank\" rel=\"noopener\">no-cloning theorem<\/a>, the copying of the output before sending it to several neurons in the next layer is non-trivial.<\/li>\n<\/ul>\n<p>In order to fix the irreversibility of the classical network, a dummy bit in state <img src='https:\/\/s0.wp.com\/latex.php?latex=0&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='0' title='0' class='latex' \/> is added to the input vector and the activation function <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvarphi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\varphi' title='\\varphi' class='latex' \/> is changed to output <img src='https:\/\/s0.wp.com\/latex.php?latex=x_1%2C+x_2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='x_1, x_2' title='x_1, x_2' class='latex' \/> and <img src='https:\/\/s0.wp.com\/latex.php?latex=y&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='y' title='y' class='latex' \/>. 
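<\/p>
<p>In code, the dummy-bit trick looks as follows. This is a toy sketch: the threshold weights are made-up values, and XOR-ing the output into the dummy bit is one standard way to obtain a bijection; with the dummy bit initialised to 0 it reproduces the map (x1, x2, 0) &#8594; (x1, x2, y) described above:<\/p>

```python
# Toy sketch of the dummy-bit construction (the threshold weights are
# made-up values; XOR-ing the output into the dummy bit makes the map
# a bijection on all 3-bit inputs).

def phi(x1, x2, w=(1, -1)):
    """Classical transfer function: two input bits squeezed into one output."""
    return 1 if w[0] * x1 + w[1] * x2 >= 0 else 0

def phi_rev(x1, x2, d, w=(1, -1)):
    """Reversible version: (x1, x2, d) -> (x1, x2, d XOR phi(x1, x2))."""
    return (x1, x2, d ^ phi(x1, x2, w))

# With the dummy bit initialised to 0 the last output equals phi(x1, x2),
# and applying phi_rev twice restores the input, so the map is invertible.
for x1 in (0, 1):
    for x2 in (0, 1):
        assert phi_rev(x1, x2, 0)[2] == phi(x1, x2)
        for d in (0, 1):
            assert phi_rev(*phi_rev(x1, x2, d)) == (x1, x2, d)
```

<p>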
The function <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvarphi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\varphi' title='\\varphi' class='latex' \/> now has three inputs and three outputs, making the operation reversible:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2435\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/rfnn.png\" alt=\"\" width=\"250\" height=\"143\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/rfnn.png 537w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/rfnn-300x172.png 300w\" sizes=\"(max-width: 250px) 100vw, 250px\" \/><br \/>\nThe last step is to change the reversible function <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cvarphi&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\varphi' title='\\varphi' class='latex' \/> into a unitary quantum gate <img src='https:\/\/s0.wp.com\/latex.php?latex=U&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='U' title='U' class='latex' \/>, whose inputs and outputs are quantum states. 
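<\/p>
<p>Once the function is reversible, promoting it to a gate is mechanical: on the computational basis states a reversible three-bit map acts as an 8&#215;8 permutation matrix, which is automatically unitary. A small numerical check (with hypothetical threshold weights):<\/p>

```python
import numpy as np

# Toy check (illustrative, not from the post): a reversible map on 3 bits,
# acting on basis states |x1 x2 d>, is an 8x8 permutation matrix, and any
# permutation matrix is unitary (U U^T = I).

def f(x1, x2, d):
    # reversible neuron: inputs pass through, output is XOR-ed into d
    y = 1 if x1 - x2 >= 0 else 0          # hypothetical weights (1, -1)
    return (x1, x2, d ^ y)

def idx(bits):
    # basis-state index of |x1 x2 d>
    x1, x2, d = bits
    return (x1 << 2) | (x2 << 1) | d

U = np.zeros((8, 8))
for x1 in (0, 1):
    for x2 in (0, 1):
        for d in (0, 1):
            U[idx(f(x1, x2, d)), idx((x1, x2, d))] = 1.0

assert np.allclose(U @ U.T, np.eye(8))    # U is unitary
```

<p>In an actual quantum circuit such a permutation would be compiled into elementary reversible gates, but the unitarity argument is already visible at this scale.<\/p>
<p>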
The graph below illustrates the final approach:<\/p>\n<div>&nbsp;<\/div>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2436\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qrfnn.png\" alt=\"\" width=\"250\" height=\"126\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qrfnn.png 711w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qrfnn-300x151.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qrfnn-600x300.png 600w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qrfnn-400x200.png 400w\" sizes=\"(max-width: 250px) 100vw, 250px\" \/><\/p>\n<div>An arbitrary unitary matrix acting on <img src='https:\/\/s0.wp.com\/latex.php?latex=n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='n' title='n' class='latex' \/> qubits can be represented as:<\/div>\n<div>&nbsp;<\/div>\n<div style=\"text-align: center;\"><img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+U+%3D+%5Ctextrm%7Bexp%7D%5CBig%5B+i+%5Cbig%28+%5Csum_%7Bk_1%2C%5Ccdots%2C+k_n%7D+%5Calpha_%7Bk_1%2C%5Ccdots%2C+k_n%7D+%28%5Csigma_%7Bk_1%7D%5Cotimes%5Ccdots%5Cotimes%5Csigma_%7Bk_n%7D%29%5Cbig%29%5CBig%5D+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle U = \\textrm{exp}\\Big[ i \\big( \\sum_{k_1,\\cdots, k_n} \\alpha_{k_1,\\cdots, k_n} (\\sigma_{k_1}\\otimes\\cdots\\otimes\\sigma_{k_n})\\big)\\Big] ' title='\\displaystyle U = \\textrm{exp}\\Big[ i \\big( \\sum_{k_1,\\cdots, k_n} \\alpha_{k_1,\\cdots, k_n} (\\sigma_{k_1}\\otimes\\cdots\\otimes\\sigma_{k_n})\\big)\\Big] ' class='latex' \/><\/div>\n<div>&nbsp;<\/div>\n<p>where <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Csigma_k&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\sigma_k' title='\\sigma_k' class='latex' \/> are Pauli matrices (for <img src='https:\/\/s0.wp.com\/latex.php?latex=k%5Cin%5C%7B1%2C2%2C3%5C%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='k\\in\\{1,2,3\\}' title='k\\in\\{1,2,3\\}' class='latex' 
\/>) and <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Csigma_0&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\sigma_0' title='\\sigma_0' class='latex' \/> is the <img src='https:\/\/s0.wp.com\/latex.php?latex=2%5Ctimes+2&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='2\\times 2' title='2\\times 2' class='latex' \/> identity matrix. The parameters <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Calpha&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\alpha' title='\\alpha' class='latex' \/> are the trainable parameters of the model.<\/p>\n<p>This approach is <strong>different<\/strong> from classical neural networks: instead of numeric weights, it is the <strong>transfer functions<\/strong>, <strong>represented by unitary matrices<\/strong>, that are <strong>trainable<\/strong>.<\/p>\n<p>The benefits of quantum neural networks over their classical counterparts have not yet been established. A potential speed-up could come from superposition; however, assessing how quickly such unitary matrices can be learned is difficult. Nevertheless, quantum neural networks remain an interesting and promising branch of quantum machine learning that is under active development.<\/p>\n<p><strong>Quantum Boltzmann Machine<\/strong><\/p>\n<p>A <a href=\"https:\/\/en.wikipedia.org\/wiki\/Boltzmann_machine\" target=\"_blank\" rel=\"noopener\">Boltzmann machine<\/a> (BM) is a type of <a title=\"Stochastic neural network\" href=\"https:\/\/en.wikipedia.org\/wiki\/Stochastic_neural_network\" target=\"_blank\" rel=\"noopener\">stochastic<\/a> <a title=\"Recurrent neural network\" href=\"https:\/\/en.wikipedia.org\/wiki\/Recurrent_neural_network\" target=\"_blank\" rel=\"noopener\">recurrent neural network<\/a> (and <a title=\"Markov random field\" href=\"https:\/\/en.wikipedia.org\/wiki\/Markov_random_field\" target=\"_blank\" rel=\"noopener\">Markov random field<\/a>). 
They are named after the <a title=\"Boltzmann distribution\" href=\"https:\/\/en.wikipedia.org\/wiki\/Boltzmann_distribution\">Boltzmann distribution<\/a> in statistical mechanics (this is why they are classified as Energy Based Models).&nbsp;BMs are classic machine learning techniques, and serve as the basis of deep learning models such as deep belief networks and deep Boltzmann machines.<\/p>\n<p>A BM commonly consists of visible and hidden binary units <img src='https:\/\/s0.wp.com\/latex.php?latex=z_n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='z_n' title='z_n' class='latex' \/>:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2480\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/bm.png\" alt=\"\" width=\"300\" height=\"206\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/bm.png 802w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/bm-300x206.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/bm-768x529.png 768w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p>\n<p>The quadratic energy function over binary units <img src='https:\/\/s0.wp.com\/latex.php?latex=z_n&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='z_n' title='z_n' class='latex' \/> is referred to as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Ising_model\" target=\"_blank\" rel=\"noopener\">Ising model<\/a> :<\/p>\n<p style=\"text-align: center;\"><img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+E_z+%3D+-+%5Csum_a+b_a+z_a+-+%5Csum_%7Ba%2Cb%7D+w_%7Bab%7Dz_a+z_b&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle E_z = - \\sum_a b_a z_a - \\sum_{a,b} w_{ab}z_a z_b' title='\\displaystyle E_z = - \\sum_a b_a z_a - \\sum_{a,b} w_{ab}z_a z_b' class='latex' \/><\/p>\n<p>where the dimensionless parameters <img src='https:\/\/s0.wp.com\/latex.php?latex=b_a&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='b_a' title='b_a' class='latex' \/> and <img 
src='https:\/\/s0.wp.com\/latex.php?latex=w_%7Bab%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='w_{ab}' title='w_{ab}' class='latex' \/> are tuned during training. In equilibrium, the probability of observing a state <img src='https:\/\/s0.wp.com\/latex.php?latex=v&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='v' title='v' class='latex' \/> of the visible variables is given by the Boltzmann distribution summed over the hidden variables:<\/p>\n<p style=\"text-align: center;\"><img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+P_v+%3D+Z%5E%7B-1%7D+%5Csum+e%5E%7B-E_z%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle P_v = Z^{-1} \\sum e^{-E_z}' title='\\displaystyle P_v = Z^{-1} \\sum e^{-E_z}' class='latex' \/> and <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+Z+%3D+%5Csum+e%5E%7B-E_z%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle Z = \\sum e^{-E_z}' title='\\displaystyle Z = \\sum e^{-E_z}' class='latex' \/><\/p>\n<p>The goal is to determine the Hamiltonian parameters <img src='https:\/\/s0.wp.com\/latex.php?latex=%5C%7B+b_a%2Cw_%7Bab%7D+%5C%7D+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\{ b_a,w_{ab} \\} ' title='\\{ b_a,w_{ab} \\} ' class='latex' \/> such that <img src='https:\/\/s0.wp.com\/latex.php?latex=P_v&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P_v' title='P_v' class='latex' \/> becomes as close as possible to <img src='https:\/\/s0.wp.com\/latex.php?latex=P_v%5E%7B%5Cmathrm%7Bdata%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='P_v^{\\mathrm{data}}' title='P_v^{\\mathrm{data}}' class='latex' \/> defined by the training set. 
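<\/p>
<p>For intuition, the Boltzmann distribution and the partition function can be computed exactly for a very small machine by brute-force enumeration (the biases and couplings below are made-up toy values):<\/p>

```python
import itertools
import math

# Tiny Boltzmann machine, solved exactly by enumeration (toy parameters):
# units 0 and 1 are visible, unit 2 is hidden, values in {-1, +1}.
b = [0.1, -0.2, 0.05]                          # biases b_a
w = {(0, 1): 0.3, (0, 2): -0.1, (1, 2): 0.2}   # couplings w_ab

def energy(z):
    # E_z = - sum_a b_a z_a - sum_{a,b} w_ab z_a z_b
    return (-sum(b[a] * z[a] for a in range(3))
            - sum(wab * z[a] * z[c] for (a, c), wab in w.items()))

states = list(itertools.product((-1, 1), repeat=3))
Z = sum(math.exp(-energy(z)) for z in states)   # partition function

# P_v: Boltzmann distribution marginalised over the hidden unit
P_v = {v: sum(math.exp(-energy(v + (h,))) for h in (-1, 1)) / Z
       for v in itertools.product((-1, 1), repeat=2)}

assert abs(sum(P_v.values()) - 1.0) < 1e-12     # probabilities sum to 1
```

<p>Exact enumeration is of course only feasible for a handful of units; real training replaces it with sampling, as discussed below.<\/p>
<p>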
To achieve this, one needs to minimize the average negative log-likelihood:<\/p>\n<p style=\"text-align: center;\"><img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+%5Cmathcal%7BL%7D%3D+-+%5Csum_v+P_v%5E%7B%5Cmathrm%7Bdata%7D%7D+%5Cmathrm%7Blog%7D+P_v+&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle \\mathcal{L}= - \\sum_v P_v^{\\mathrm{data}} \\mathrm{log} P_v ' title='\\displaystyle \\mathcal{L}= - \\sum_v P_v^{\\mathrm{data}} \\mathrm{log} P_v ' class='latex' \/><\/p>\n<p>To jump into the quantum realm, we now replace classical bits with qubits:<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter wp-image-2479\" src=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qbm.png\" alt=\"\" width=\"385\" height=\"197\" srcset=\"https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qbm.png 1058w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qbm-300x153.png 300w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qbm-768x393.png 768w, https:\/\/www.quantum-bits.org\/wp-content\/uploads\/2018\/07\/qbm-1024x524.png 1024w\" sizes=\"(max-width: 385px) 100vw, 385px\" \/><\/p>\n<p>For the energy function, one then considers the Hamiltonian:<\/p>\n<ul>\n<li><img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+H+%3D+-+%5Csum_a+b_a+%5Csigma_a%5Ez+-+%5Csum_%7Ba%2Cb%7D+w_%7Bab%7D%5Csigma_a%5Ez+%5Csigma_b%5Ez&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle H = - \\sum_a b_a \\sigma_a^z - \\sum_{a,b} w_{ab}\\sigma_a^z \\sigma_b^z' title='\\displaystyle H = - \\sum_a b_a \\sigma_a^z - \\sum_{a,b} w_{ab}\\sigma_a^z \\sigma_b^z' class='latex' \/> &nbsp; with &nbsp; <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+%5Csigma_a%5Ez+%5Cequiv+%5Cunderbrace+%7BI+%5Cotimes+%5Ccdots+%5Cotimes+I+%7D_%7Ba-1%7D%5Cotimes+%5Csigma_z+%5Cotimes+%5Cunderbrace%7BI+%5Cotimes+%5Ccdots+%5Cotimes+I%7D_%7Bn-a%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle 
\\sigma_a^z \\equiv \\underbrace {I \\otimes \\cdots \\otimes I }_{a-1}\\otimes \\sigma_z \\otimes \\underbrace{I \\otimes \\cdots \\otimes I}_{n-a}' title='\\displaystyle \\sigma_a^z \\equiv \\underbrace {I \\otimes \\cdots \\otimes I }_{a-1}\\otimes \\sigma_z \\otimes \\underbrace{I \\otimes \\cdots \\otimes I}_{n-a}' class='latex' \/><\/li>\n<li><img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+I+%3D%5Cbegin%7Bbmatrix%7D+1+%26+0+%5C%5C+0+%26+1+%5Cend%7Bbmatrix%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle I =\\begin{bmatrix} 1 &amp; 0 \\\\ 0 &amp; 1 \\end{bmatrix}' title='\\displaystyle I =\\begin{bmatrix} 1 &amp; 0 \\\\ 0 &amp; 1 \\end{bmatrix}' class='latex' \/>&nbsp; and the Pauli matrix <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Cdisplaystyle+%5Csigma_z+%3D%5Cbegin%7Bbmatrix%7D+1+%26+0+%5C%5C+0+%26+-1+%5Cend%7Bbmatrix%7D&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\displaystyle \\sigma_z =\\begin{bmatrix} 1 &amp; 0 \\\\ 0 &amp; -1 \\end{bmatrix}' title='\\displaystyle \\sigma_z =\\begin{bmatrix} 1 &amp; 0 \\\\ 0 &amp; -1 \\end{bmatrix}' class='latex' \/><\/li>\n<\/ul>\n<p>The standard approach to training Boltzmann machines relies on the computation of certain averages that can be estimated by standard&nbsp;sampling&nbsp;techniques, such as&nbsp;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Markov_chain_Monte_Carlo\" target=\"_blank\" rel=\"noopener\">Markov chain Monte Carlo algorithms<\/a>. 
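<\/p>
<p>As a sketch of that sampling step, here is a minimal Gibbs sampler for the Ising energy above. For units taking values in {-1, +1}, the conditional probability of a unit being +1 given the rest is 1\/(1 + exp(-2h)), where h is the unit&#8217;s local field (the parameters are made-up toy values):<\/p>

```python
import math
import random

random.seed(0)

# Minimal Gibbs sampler for the Ising energy E_z (toy parameters).
# For z_a in {-1, +1}: P(z_a = +1 | rest) = 1 / (1 + exp(-2 * h_a)),
# where h_a = b_a + sum_b w_ab * z_b is the local field.
b = [0.1, -0.2, 0.05]
w = {(0, 1): 0.3, (0, 2): -0.1, (1, 2): 0.2}

def local_field(z, a):
    h = b[a]
    for (i, j), wij in w.items():   # each coupling stored once with i < j
        if i == a:
            h += wij * z[j]
        elif j == a:
            h += wij * z[i]
    return h

def gibbs_sweep(z):
    # resample each unit conditioned on the current state of the others
    for a in range(len(z)):
        p_up = 1.0 / (1.0 + math.exp(-2.0 * local_field(z, a)))
        z[a] = 1 if random.random() < p_up else -1
    return z

z = [random.choice((-1, 1)) for _ in range(3)]
samples = [tuple(gibbs_sweep(z)) for _ in range(5000)]
# averages such as <z_a> or <z_a z_b> are then estimated from `samples`
```

<p>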
Another possibility is to rely on a physical process, like <a href=\"https:\/\/en.wikipedia.org\/wiki\/Quantum_annealing\" target=\"_blank\" rel=\"noopener\">quantum annealing<\/a>, that naturally generates samples from a Boltzmann distribution.<\/p>\n<p>The objective is to find the optimal control parameters that best represent the empirical distribution of a given dataset.<\/p>\n<p><strong>Other approaches<br \/>\n<\/strong><\/p>\n<p>There are many other approaches to quantum AI, and a single post won&#8217;t be nearly enough to cover them all. To name a few:<\/p>\n<ul>\n<li>Quantum principal component analysis<\/li>\n<li>Quantum <em>k<\/em>-means algorithm<\/li>\n<li>Quantum <em>k<\/em>-medians algorithm<\/li>\n<li>Quantum Bayesian Networks<\/li>\n<li>etc.<\/li>\n<\/ul>\n<p>I would recommend this paper to explore a few of them in an accessible and consistent way: <a href=\"https:\/\/arxiv.org\/pdf\/1804.10068.pdf\" target=\"_blank\" rel=\"noopener\">Quantum machine learning for data scientists<\/a>.<\/p>\n<p>&nbsp;<br \/>\n<span style=\"font-size: 8pt;\">Note: A few paragraphs and illustrations of this post are based on selected papers on quantum AI and Wikipedia articles.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Mixing quantum computing and Artificial Intelligence (AI) may sound like a new buzzword. 
However, since quantum computing advances are hinting at profound changes in the very notions of computation, it &#8230;<\/p>\n","protected":false},"author":1,"featured_media":3846,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0},"categories":[6],"tags":[],"_links":{"self":[{"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/posts\/2336"}],"collection":[{"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2336"}],"version-history":[{"count":0,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/posts\/2336\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=\/wp\/v2\/media\/3846"}],"wp:attachment":[{"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2336"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2336"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.quantum-bits.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2336"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}