Paper

Quantum pattern recognition in photonic circuits

, , , and

Published 16 November 2021 © 2021 IOP Publishing Ltd. All rights, including for text and data mining, AI training, and similar technologies, are reserved.
, , Citation Rui Wang et al 2022 Quantum Sci. Technol. 7 015010DOI 10.1088/2058-9565/ac3460

2058-9565/7/1/015010

Abstract

This paper proposes a machine learning method to characterize photonic states via a simple optical circuit and data processing of photon number distributions, such as photonic patterns. The input states consist of two coherent states used as references and a two-mode unknown state to be studied. We successfully trained supervised learning algorithms that can predict the degree of entanglement in the two-mode state as well as perform the full tomography of one photonic mode, obtaining satisfactory values in the considered regression metrics.

Export citation and abstractBibTeXRIS

1. Introduction

Quantum information processing deals with the manipulation of quantum states in order to perform quantum informational tasks such as quantum algorithms [1], quantum error correction [2], quantum cryptography [3], or quantum teleportation [4]. It is known that quantum information has the potential to outperform classical information protocols [59]. However, since a quantum system is modified after measurement, extracting arbitrary information from a quantum state may require many copies. In general, quantum tomography (QT) can be employed for the full reconstruction of a quantum state or a quantum operator. In a nutshell, QT implies the measurement of the expectation values of several operators or the use of a mutually unbiased basis [1012], which is in general a hard experimental task. In the last years, some efficient protocols have been proposed assisted by machine learning (ML) algorithms, although it may still be tricky depending on the dimension of the quantum system [1318].

Quantum information can be encoded, decoded, and manipulated in a variety of physical systems, for example, photonic systems [19], solid state devices [20], trapped ions [21, 22], and superconducting architectures [23, 24]. Photonic platforms present several advantages for quantum information protocols, such as long coherence times and full connectivity, allowing long-distance quantum communication and quantum key distribution, among many other achievements [2527]. Different photonic degrees of freedom, including polarization, spectral, spatial, and temporal modes can be used to encode information [19], providing a huge variety of experimental resources for many quantum information tasks.

One of the most intriguing experimental resources for photonic platform is boson sampling, which is a model of non-universal quantum computation. It consists in measuring or sampling the quantity of photons in a distribution produced by a linear interference device given an initial photonic Fock state [28]. It is a complex problem that cannot be efficiently simulated on conventional computers. However, with accessible experimental requirements, it can be simulated on linear optical quantum computers. It is attractive that the experimental setup of boson sampling only requires single-photon sources, photodetectors, and linear optical elements, i.e. beam splitters and phase shifters [29]. Such a feasibility has encouraged and inspired many research teams for lab implementations [3034]. Particularly, recent experiments have shown the potential of photonic technologies to obtain quantum advantage for scientific relevant problems; for instance, quantum supremacy of Gaussian boson sampling with 76 photons [7], and the so-called timestamp boson sampling, that brings to light the potential of circuits to get memory effects, thus paving the way for the implementation of neuromorphic quantum computing [35, 36].

On a separate note, it is known that ML algorithms help to extract information from large and complex data sets, encompassing different techniques with sound mathematical grounds [37, 38]. The information is extracted by means of learning algorithms, that teach models which usually have no a priori knowledge of the problem. The increasing availability and size of data sets has spread the use of ML [39, 40], oftentimes involving reliable applications at academic and commercial levels.

The main goal of this manuscript is to show a simple and experimentally feasible photonic architecture capable to produce photonic patterns which assisted by ML algorithms could give an estimation about quantum features hard to calculate experimentally. To this end, we consider whether the full tomography of a photonic mode and two-mode entanglement can be extracted via sampling the output distribution of photons. In order to obtain a relation between the distribution probability of the output state and the information of the unknown state, we propose a particular linear optical circuit with four spatial modes, and calculate the corresponding permanent of the submatrix of the unitary matrix. We make use of a data set of probability patterns, that corresponds with the photon number distributions of output states. Then, we change the parameters of the unknown state to build a supervised learning algorithm for estimating the state of a new probability pattern, distinct from the training one.

The rest of the article is organized as follows. In section 2, we briefly review the most important aspects of boson sampling. Sections 3 and 4 introduce our four-mode optical circuit for state estimation of a photonic-mode, and entanglement estimation for a bipartite system, respectively. The results of the ML approach for state estimation and entanglement estimation are shown in section 5, ending up the paper with the conclusions of the work in section 6.

2. Boson-sampling model

A boson-sampling experiment consists in measuring the photon number probability produced by the interference of N photonic states, usually indistinguishable single-photon states, via an M-mode linear network. The distribution can be obtained by computing permanents of the submatrix derived from the unitary transformation matrix of the network [41]. The calculation of the permanent for an n × n complex matrix is a #P-hard task for classical computers and #P-complete for a (0, 1)-matrix [42], which means that the simulation of a boson-sampling experiment by classical devices is inefficient.

Without loss of generality, the output state of an M-mode optical circuit after the interference of an N-photon state can be calculated as follows. First, the input state is given by

Equation (1)

where ${\hat{a}}_{k}^{{\dagger}}$ is the photon creation operator for the kth mode, Ik is the number of photons in the kth mode, and ∑k Ik = N. The input and output states are related by a unitary transformation,

Equation (2)

where $\mathbb{U}\left\vert 0\right\rangle =\left\vert 0\right\rangle $ because a linear optical circuit preserves the photon number. Now, the operator transformation reads

Equation (3)

Again, as a linear optical circuit preserves the number of photons, then uk,j defines a unitary matrix U, which represents a superoperator that acts over the space of the creation operators. Then, the output state reads

Equation (4)

where O is a photon-mode configuration containing N photons, γO is the superposition factor of the configuration O, and $\left\vert {O}_{k}\right\rangle $ is the photon number state for the kth-output mode in the Oth configuration. The probability of measuring configuration O is given by PO = |γO |2. The probability PO is given by

Equation (5)

where Per[⋅] is the permanent and ${{\Lambda}}_{\vert {\psi }_{\text{I}}\rangle ,\vert {\psi }_{\text{O}}\rangle }$ is an N × N matrix, which can be obtained from the elements ujk of that defines U [29]. The matrix ${{\Lambda}}_{\vert {\psi }_{\text{I}}\rangle ,\vert {\psi }_{\text{O}}\rangle }$ reads

Equation (6)

where

Equation (7)

are the number of output and input photons until mode , respectively, being ${S}_{0}^{\text{I}(\text{O})}=0$. It should be borne in mind that, for any linear network with M modes, we have that ${S}_{M}^{\text{I}}={S}_{M}^{\text{O}}=N$. For example, if $\left\vert {\psi }_{\text{I}}\right\rangle =\left\vert {0}_{1}{1}_{2}{0}_{3}{2}_{4}\right\rangle $ and $\left\vert {\psi }_{\text{O}}\right\rangle =\left\vert {1}_{1}{0}_{2}{1}_{3}{1}_{4}\right\rangle $, then ${S}_{1}^{\text{I}}=0$, ${S}_{2}^{\text{I}}=1$, ${S}_{3}^{\text{I}}=1$, and ${S}_{4}^{\text{I}}=3$, while ${S}_{1}^{\text{O}}=1$, ${S}_{2}^{\text{O}}=1$, ${S}_{3}^{\text{O}}=2$, and ${S}_{4}^{\text{O}}=3$. Then, the matrix ${{\Lambda}}_{\vert {\psi }_{\text{I}}\rangle ,\vert {\psi }_{\text{O}}\rangle }$ reads

Equation (8)

3. Optical circuit for state tomography

We used a boson-sampling circuit to achieve the full QT of an unknown superposition state in the Fock basis. For obtaining more information of the unknown state, we proposed a simple optical circuit formed by three 50%–50% beam splitters, as shown in figure 1. The unitary matrix U for this circuit is given by

Equation (9)

where $a=1/\sqrt{2}$. The input states for the first and the last mode are coherent states given by

Equation (10)

with θ = 0 for the mode 1 and θ = π/2 for the mode 4. The initial state for mode 2 is the vacuum state $\left\vert {0}_{2}\right\rangle $, and for the mode 3 it is an arbitrary unknown state $\vert {\eta }_{3}\rangle ={\sum }_{\ell =0}^{N}\hspace{2pt}{r}_{\ell }\enspace {\text{e}}^{\text{i}{\phi }_{\ell }}\vert {n}_{3}\rangle $.

Figure 1. Refer to the following caption and surrounding text.

Figure 1. Core optical circuit for QT made up of three beam splitters and the initial state $\vert {\psi }_{\text{I}}\rangle =\vert {\psi }_{1}^{(0)}\rangle \vert {0}_{2}\rangle \vert {\eta }_{3}\rangle \vert {\psi }_{4}^{(\pi /2)}\rangle $.

Standard image High-resolution image

The considered initial state of our optical circuit is given by

Equation (11)

This state is a superposition of different four-mode states with different number of photons. For the sake of clarity and simplicity, we rewrite the initial state as

Equation (12)

where $\vert {\psi }_{s}^{\text{I}}\rangle $ is a non-normalized state with the superposition of all the four-mode states with s photons, which reads

Equation (13)

Then, the output state is a superposition of different configurations of how the photons may have arrived to their modes,

Equation (14)

where Cs denotes a particular configuration with s photons. The probability of measuring a configuration Cs is given by

Equation (15)

where Nmin = min (s, N). Next, we will consider particular cases of the maximum number of photons of the state $\left\vert {\eta }_{3}\right\rangle $ (N = 1 and N = 2), and then we will extend them to an arbitrary superposition of Fock states.

3.1. The case N = 1

The unknown state is $\vert {\eta }_{3}\rangle ={r}_{0}\vert 0\rangle +{r}_{1}\enspace {\text{e}}^{\text{i}{\phi }_{1}}\vert 1\rangle $ when N = 1. Firstly, for s = 0 we have only one possible output ${C}_{0}=\left\vert 0000\right\rangle $ with probability given by equation (15),

Equation (16)

where we consider ϕ0 = 0. For s = 1 we have four different outputs, ${C}_{1}\in \left\{\left\vert 1000\right\rangle ,\left\vert 0100\right\rangle ,\left\vert 0010\right\rangle ,\left\vert 0001\right\rangle \right\}$, with the general formula to get the probability given by

Equation (17)

The probabilities are

Equation (18)

Equation (19)

Equation (20)

Equation (21)

Note that the parameters r0, r1, and ϕ1 are encoded in the probabilities. Therefore, these parameters can be obtained by calculating the sum or difference of the probability distributions.

3.2. The case N = 2

In this case, we consider $\vert {\eta }_{3}\rangle ={r}_{0}\vert 0\rangle +{r}_{1}\enspace {\text{e}}^{\text{i}{\phi }_{1}}\vert 1\rangle +{r}_{2}\enspace {\text{e}}^{\text{i}{\phi }_{2}}\vert 2\rangle $; the probabilities for s < 2 are the same as in the previous case. For s = 2, we have

Equation (22)

For = 0 and g + h = 2 − n, the permanent is given by

Equation (23)

For ≠ 0, we have the case g + hd = 2 − n and k + fq = n, where d + q = . Here, the corresponding permanent reads

Equation (24)

where ${B}_{j}^{k}=\frac{j!}{k!(j-k)!}$ is the binomial coefficient. For example, for = 1, $\text{Per}[{{\Lambda}}_{\left\vert 1010\right\rangle ,\left\vert 2000\right\rangle }]$ is given by

Equation (25)

When = 2, ${\text{Per}}_{{{\Lambda}}_{\left\vert 0020\right\rangle ,\left\vert 1001\right\rangle }}$ can be written as

Equation (26)

Then, the probabilities for the configurations $\left\vert 2000\right\rangle $ and $\left\vert 0020\right\rangle $ are given by

Equation (27)

Equation (28)

For the two-photon configuration (s = 2), we can code the different phase differences ϕj ϕk and probabilities amplitudes rj for j, k ∈ {0, 1, 2}.

The same process can be considered for each configuration with a fix number of photons s, and we can find that the unknown phases and amplitudes will be encoded in the different probabilities. This will produce more complex mathematical expressions, thus being hard to calculate the analytical expression to recover some information from the unknown state $\left\vert {\eta }_{3}\right\rangle $. Nevertheless, it looks a suitable problem for ML methods, which can learn and recognize patterns from data. It is important to mention that when the reference states are quantum superpositions $(\left\vert 0\right\rangle +{\text{e}}^{\text{i}\theta }\left\vert 1\right\rangle )/\sqrt{2}$ for θ = 0 and θ = π/2, the mathematical form of the output probabilities are simple. In this case, analytical expressions for all the amplitudes and phases may be found in a straightforward manner, reaching full tomography with 100% accuracy.

4. Entanglement estimation

Entanglement is one of the most important resources for quantum information. It describes nonlocal correlations between quantum states, and it has become an important tool for understanding the states of many-body systems. For bipartite systems, the entanglement entropy has become a theoretical measure for categorizing such states. For the entanglement coding, we used the optical circuit shown in figure 2, where there is an unknown two-mode input state. We will make use of the von Neumann entropy of a state, which is defined as

Equation (29)

where ρ is the density operator of the composite system, while Tr(⋅) denotes the trace.

Figure 2. Refer to the following caption and surrounding text.

Figure 2. Core optical circuit for QT made up of three beam splitters and the initial input state.

Standard image High-resolution image

For the calculation of the initial state, we suppose that the first and last modes are coherent states given by

Equation (30)

with θ = 0 for the mode 1, θ = π/2 for the mode 4 as in the previous case, and a bipartite system for the second and third mode, it means a wave function of the following form

Equation (31)

where the parameters rjv and ϕjv satisfy ${\sum }_{j,v=0}^{N}\hspace{2pt}{r}_{jv}^{2}=1$ and ϕjv ∈ [0, 2π]. Then, the initial state of our optical circuit is

Equation (32)

We describe the bipartite system in modes 3 and 4 by the density operator ρAB = |ψ⟩⟨ψ|, and the reduced density matrix of the subsystem A(B) as ρA = TrB (ρAB ) (ρB = TrA (ρAB )). ${\mathrm{Tr}}_{\mathcal{S}}$ means tracing over subsystem $\mathcal{S}$. The entanglement E(ρAB ) of a bipartite system ρAB may be defined as the von Neumann entropy of the reduced density matrix of a subsystem,

Equation (33)

For simplicity, we rewrite the state given by equation (32) as

Equation (34)

where $\vert {\psi }_{s}^{\text{I}}\rangle $ is a non-normalized state of the superposition of all the four-mode states with s photons, which reads

Equation (35)

Then, the output state is a superposition of the different configurations of how photons may have arrived to output modes,

Equation (36)

where Cs denotes a particular configuration with s photons. The probability of measuring a specific configuration Cs reads

Equation (37)

where s* = sjvn. Now, each set of probabilities is related to a given degree of entanglement. Then, again via the use of an ML protocol based on training a pattern recognition algorithm, we can estimate this essential feature.

5. Machine learning for state characterization

ML was used to find the relation between the patterns and the initial arbitrary state so that the model, once trained, could infer the state of a new pattern not belonging to the training set. In particular, we developed regression models for the values of amplitude and phase, calculating the associated fidelity as figure-of-merit. Due to the sparsity of the data set, support vector regressors (SVRs) were chosen to carry out the regression of amplitudes and phases as our first approach [43, 44]. SVRs are a generalization of support vector machines, which try to solve classification problems by formulating them as convex optimization problems. In this manner, one has to find the suitable hyperplanes that classify correctly as many training samples as possible. In particular, SVRs work by creating a transformed data space in which the problem is more easily solvable and, ideally the problem is transformed into a linear one. That transformation between spaces is carried out by the so-called kernels. Gaussian, linear, and polynomial kernels have been used in this experimentation. An SVR introduces a region in the hyperspace of the problem called epsilon-tube, within which, all predictions are considered as correct.

The first three tables show the results of the predictions made by the ML models, evaluating those predictions by means of the fidelity as figure-of-merit. For each data instance, there is a corresponding fidelity value; the tables show the mean value of the fidelities as well as its standard deviation and the quartiles as measures of dispersal. A percentile X% means that in X% of our data samples, fidelities are below the percentile X% value. For example, in table 1, for the Radial basis function (RBF) (Gaussian SVR), 25% of the samples give fidelities below 0.6397; this of course means that 75% of the results give fidelity values above 0.6397. Tables 4 and 5 report the entanglement estimation. The entropy mean value is given for reference purposes to compare its magnitude and its mean absolute error (MAE). We also give the standard deviation of the absolute errors as well as their quartiles and the coefficient of determination (R2-score) for the models. The coefficient of determination evaluates the quality of the regression, being 1 its best possible value; a constant model that gives the same output value disregarding the input features would have a score of 0.

Table 1. Fidelities achieved by SVRs in the N = 3 case with three different kernels: RBF (Gaussian), linear, and polynomial. The first column specifies the metrics used to assess the fidelities, namely, the mean value alongside the corresponding standard deviation as dispersal measure, as well as the minimum, maximum and quartile values to give information about the distribution of fidelity measures.

 RBFLinearPolynomial
Mean0.742 5300.655 2620.735 201
Standard deviation0.215 0960.237 1840.221 150
Minimum0.072 7210.006 9890.063 000
25%0.639 7210.486 1280.620 836
50%0.807 7030.699 0800.802 010
75%0.914 6780.844 5860.926 584
Maximum0.992 3250.983 4820.991 405

Starting with the specific results of each experiment, table 1 shows the values of fidelity obtained by SVRs implemented with Gaussian (RBF), linear, and polynomial kernels for the case of N = 3. The maximum fidelities are achieved by the Gaussian kernel, that will be hence selected for ulterior analyses. The polynomial kernel obtains slightly lower fidelities than the Gaussian one, whereas the linear kernel shows the poorest modeling capability among the three.

For the sake of comparison with SVRs, we considered a state-of-the-art method, such as extremely randomized trees (ERTs) [45], which build multiple regression trees. Each tree takes a random subset of the input features, while nodes are randomly split for the whole data set (in contrast with the well-known random forest, there is no bootstrap). To reduce the problem dimensionality, we made use of a principal component analysis (PCA) [37]. We set the explained variance ratio at 0.999, so that we could get a good-enough representation of our data that allows for a reliable fidelity modeling while considerably reducing the computational time.

Table 2 shows the fidelity values obtained by SVRs with a Gaussian kernel, for non-PCA and PCA versions of the data sets. The use of PCA seems to have no visible effects on the fidelity performance. In fact, fidelities tend to be slightly higher when using PCA, thus suggesting its suitability for dimensionality reduction. Table 2 also shows that the most reliable fidelities are achieved for N = 2, slightly lower values are obtained for N = 1, and even lower for N = 3.

Table 2. Fidelities achieved by SVRs with a Gaussian kernel for the three different data sets: N = 1, N = 2 and N = 3. Results using a preprocessing based on a PCA and not using PCA are compared. The first column specifies the metrics used to assess the fidelities, namely, the mean value alongside the corresponding standard deviation as dispersal measure, as well as minimum, maximum and quartile values to give information about the distribution of fidelity measures.

Without PCA N = 1 N = 2 N = 3
Mean0.8231510.8504000.742530
Standard deviation0.1904040.2048780.215096
Minimum0.0547450.1058600.072721
25%0.7407020.7873480.639721
50%0.8819100.9500540.807703
75%0.9703620.9832190.914678
Maximum0.9998270.9988610.992325
With PCA N = 1 N = 2 N = 3
Mean0.816 2230.867 9690.743 020
Standard deviation0.221 8730.200 7540.212 160
Minimum0.058 6250.127 3000.135 495
25%0.734 0360.811 4260.636 835
50%0.908 7660.963 3270.783 768
75%0.986 3010.991 3050.912 941
Maximum0.999 9820.998 9940.998 755

Table 3 includes the values of fidelity obtained by ERTs for both non-PCA and PCA versions of the data sets. As in the case of SVRs, the use of PCA does not affect the performances in a great extent. In particular, the use of PCA provides slightly better results for N = 2 and N = 3, while for N = 1 the use of the original data without PCA yields higher values of fidelity. The comparison of tables 2 and 3 show that the ERTs carry out a better regression than SVRs for N = 2 and N = 3, being an SVR approach better for N = 1. This might likely be due to the fact that N = 1 corresponds to a more sparse data set, where the SVR is a more adequate and natural solution. Furthermore, that data set may not contain enough variability to exploit ERT capabilities. In summary, the use of SVRs is suggested for N = 1 and ERTs for any N > 1. Furthermore, as PCA does not have a relevant impact on performance, tends to lead to higher fidelities, and allows a reduction of the computational times, its use to reduce the dimensionality of the data sets is encouraged.

Table 3. Fidelities achieved by ERTs for the three different data sets: N = 1, N = 2 and N = 3. Results using a preprocessing based on a PCA and not using PCA are compared. The first column specifies the metrics used to assess the fidelities, namely, the mean value alongside the corresponding standard deviation as dispersal measure, as well as minimum, maximum and quartile values to give information about the distribution of fidelity measures.

Without PCA N = 1 N = 2 N = 3
Mean0.7198500.9000170.761558
Standard deviation0.2270720.1914050.206226
Minimum0.0692040.1388630.090518
25%0.5577480.9346720.671530
50%0.7513230.9870360.813736
75%0.9165200.9949590.928541
Maximum0.9999960.9998940.988078
With PCA N = 1 N = 2 N = 3
Mean0.664 2260.922 8770.787 241
Standard deviation0.250 0870.157 8770.189 888
Minimum0.066 8880.050 0810.116 078
25%0.467 3510.947 0350.702 872
50%0.703 9730.981 6610.852 074
75%0.892 1370.992 3860.936 313
Maximum0.999 2330.999 7490.994 355

Results about entanglement estimation via the corresponding entropy are shown in tables 4 and 5. The performance of SVRs and ERTs are compared via the MAE and R2 scores for different data sets. ERTs outperform SVRs in all cases except for N = 1, when there is no PCA preprocessing again very likely due to sparsity. This conjecture is reinforced by the fact that after using PCA, hence reducing sparsity, SVRs are not better regressors than ERTs, even for N = 1. As in the estimation of fidelities, PCA does not have a relevant impact on performance. In fact, its use tends to lead to slightly better performances (higher R2-scores and lower MAEs) whilst allowing a reduction of the computational times. Therefore, its use to reduce the dimensionality of the data sets is encouraged. We may infer that, for entanglement estimation, ERTs represent a more adequate choice than SVRs.

Table 4. SVRs with Gaussian kernel: entanglement estimation (by entropy) performance for 200 test samples and the three different data sets: N = 1, N = 2 and N = 3. Results using a preprocessing based on a PCA and not using PCA are compared. The first column specifies the mean values of entropy (as a reference for the committed errors) and the metrics used to evaluate the performance. Namely, MAE with its corresponding standard deviation, as well as the minimum, quartile and maximum values, and also R2-scores.

Without PCA N = 1 N = 2 N = 3
Entropy mean value0.297 3440.297 8220.159 738
MAE0.082 2430.142 8730.118 600
Standard deviation0.067 8440.147 5510.121 629
Minimum0.000 2130.000 8610.000 331
25%0.034 4870.065 3430.074 025
50%0.072 0870.093 3230.096 292
75%0.104 7770.161 5410.115 779
Maximum0.436 4920.934 6301.082 693
R2-score0.818 4070.600 5260.507 575
With PCA N = 1 N = 2 N = 3
Entropy mean value0.297 3440.297 8220.159 738
MAE0.085 9690.132 7190.113 547
Standard deviation0.071 6390.123 1910.122 406
Minimum0.000 9680.003 4290.002 213
25%0.038 1390.063 1350.060 838
50%0.070 7450.095 5760.093 087
75%0.112 7280.147 2610.113 246
Maximum0.518 7100.654 8911.169 980
R2-score0.800 9680.689 2080.522 331

Table 5. ERTs: entanglement estimation (by entropy) performance for 200 test samples and the three different data sets: N = 1, N = 2 and N = 3. Results using a preprocessing based on a PCA and not using PCA are compared. The first column specifies the mean values of entropy (as a reference for the committed errors) and the metrics used to evaluate the performance. Namely, MAE with its corresponding standard deviation, as well as the minimum, quartile and maximum values, and also R2-scores.

Without PCA N = 1 N = 2 N = 3
Real mean0.297 3440.297 8220.159 738
MAE0.081 3030.131 0390.084 841
Standard deviation0.078 9610.143 6800.108 290
Minimum0.000 0180.000 0940.000 005
25%0.018 0780.019 2060.012 170
50%0.054 8990.082 1590.042 033
75%0.130 1610.191 0420.117 126
Maximum0.408 0340.704 9820.573 232
R2-score0.766 0880.627 6550.664 271
With PCA N = 1 N = 2 N = 3
Real mean0.297 3440.297 8220.159 738
MAE0.073 4820.094 4490.060 760
Standard deviation0.073 7040.110 1350.103 725
Minimum0.000 0820.000 0580.000 004
25%0.018 6210.010 9090.007 419
50%0.047 4750.050 5920.026 540
75%0.100 8470.138 2070.075 164
Maximum0.386 1890.484 1200.971 228
R2-score0.822 5160.770 7420.752 039

6. Conclusion

We have proposed a state characterization via a pattern recognition algorithm in ML for photonics. It is based on the estimation of quantum features by using the output photon number distribution in a photonic circuit, similar to boson-sampling protocols for continuous variables. We have focused on the estimation of single-mode phases and amplitudes and, also, on the two-mode entanglement estimation.

SVRs and ERTs have been used to extrapolate the estimation for states not present in the training set, i.e. new probability patterns. The obtained fidelities hint that ML estimations are reliable and can be used to boost the proposed QT protocol. In particular, the use of ERTs with previous PCA preprocessing seems to be a suitable approach for setups with N > 1.

The same two ML approaches were employed for entanglement estimation through modeling the von Neumann entropy, yielding R2 scores higher than 0.75, thus suggesting the appropriateness of ML for this relevant task. In this case, the use of ERTs with a previous PCA preprocessing turns out to be the most adequate choice for all cases.

The proposed photonic circuit with four modes and three beam splitters is experimentally accessible. It means that by using a bigger circuit, with more output probabilities for the number of photons, ML algorithms will likely increase its performance, since the patterns will be more complex and will encode more information. Nevertheless, the numerical generation of the data will be a hard task due to the complexity of the problem. Also, the generation of experimental data for the training set is a demanding task due to the difficulties to characterize a quantum state. In any case, it is a useful technique even if we use simple photonic circuits or historical data produced by current photonic experiments for more complex setups. Finally, this work shows that ML techniques can be suitably used for state characterization without the burden of full tomography, paving the way for more sophisticated tools that may help for fast estimation of quantum features.

Acknowledgments

We acknowledge support from QMiCS (820505) and OpenSuperQ (820363) Projects from EU Flagship on Quantum Technologies, National Natural Science Foundation of China Grant (NSFC) (12075145), Shanghai Government Grant STCSM (2019SHZDZX01-ZX04), Spanish Government Grant PGC2018-095113-B-I00 (MCIU/AEI/FEDER, UE), Basque Government IT986-16, EU FET Open Quromorphic and EPIQUS Projects.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Please wait… references are loading.