MCMC paper finished

September 30, 2009

Abstract:

Many nuclei probed by NMR are relatively insensitive to detection, requiring signal-enhancement methods such as the Carr-Purcell Meiboom-Gill (CPMG) pulse sequence. Experiments that follow this general approach are composed of pulse trains, which give rise to characteristic spikelet patterns in the frequency domain. In the presence of multiple underlying chemical sites, each spikelet intensity is a sum of contributions from each site in unknown proportions. This work outlines a modeling approach based on Markov Chain Monte Carlo (MCMC), which removes the need for intensive simulations using the density matrix formalism. In support of this technique, a spikelet pattern is produced using the density matrix formalism for an ensemble of spin-1/2 nuclei, and the underlying chemical shifts and intensities are reproduced using the method outlined. Finally, MCMC is used to model the CPMG spectrum of a (3,3,3-trifluoropropyl)dimethylchlorosilane (TFS) treated aluminosilicate, providing evidence in support of a particular model of silanol group surface attachment to the bulk.


CPMG simulation/modeling

February 23, 2009

A script for simulating the CPMG sequence using the density matrix, with Gauss/Gauss envelope/spikelet broadening, for a forthcoming paper.


function [tot, gbb,t]=d_cpmg(N,tau,n,delta,r,rr)

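%CPMG experiment evolution via the density matrix, wjb 02/09
%simple/ideal pulse sequence:

%90y-n*[tau/2-180x-tau/2]

%inputs:
%N     points per 180 loop (must be even, since N/2 is used as an index)
%tau   duration of one 180 loop
%n     number of 180 loops
%delta chem shift
%r     envelope Gaussian broadening, used as exp(-r*t^2)
%rr    spikelet Gaussian broadening, used as exp(-rr*t^2)
%outputs:
%tot   complex signal, gbb combined Gaussian window (not applied to tot here), t time axis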

%matrix for I_x & I_y

a=[0 1/2; 1/2 0]; b=[0 -i/2; i/2 0];

%time step & initial rho

t=tau/(N-1); rho=a;

for k=1:n
sig(1)=trace(rho*a); sigi(1)=trace(rho*b);

for j=2:N/2

%iterate; free precession for tau/2

rho = [exp(-i*t*delta) 0; 0 exp(i*t*delta)]*rho*[exp(i*t*delta) 0; 0 exp(-i*t*delta)];

sig(j)=trace(rho*a); sigi(j)=trace(rho*b);

end

%apply 180x

rho = [0 exp(i*pi/2); exp(i*pi/2) 0]*rho*[0 exp(-i*pi/2); exp(-i*pi/2) 0];

%iterate; free precession for tau/2

sig(N/2+1)=trace(rho*a); sigi(N/2+1)=trace(rho*b);

for j=2:N/2

rho = [exp(-i*t*delta) 0; 0 exp(i*t*delta)]*rho*[exp(i*t*delta) 0; 0 exp(-i*t*delta)];

sig(j+N/2)=trace(rho*a); sigi(j+N/2)=trace(rho*b);
end

if k>1

tot=[tot (sig+i*sigi)];

else

tot=(sig+i*sigi);
end

end

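%spikelet broadening: one Gaussian echo window per loop, peaked at the loop boundaries

tt=-tau/2:t:t*(N-1); gb=exp(-rr.*tt.^2); gb = [gb(N/2+1:N) gb(1:N/2)]; gbb=gb;

%tile over the n loops (index m, so the imaginary unit i is not shadowed)

for m=1:n-1

gbb=[gbb gb];

end

%overall time axis and envelope broadening

t=0:t:t*(n*N-1); gbb=gbb.*exp(-r.*t.^2);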


[Figure: cpmg1]


Quick tour through Hilbert space

February 4, 2009

Before part III of forecasting with RSS+SVM+wavelets, I thought it would help to give some useful concepts from Hilbert space. Attached is a very rough look, and also an application using orthogonal functions to model a stationary signal (Fourier series). This will contrast nicely with wavelets, which are most useful for non-stationary signals, e.g., stock indices.
hilbert space overview
Fourier series example


Forecasting w/ RSS feeds + SVM + Wavelets

January 29, 2009

You know the words, but probably haven’t seen them in the same sentence. Simply put, I’m going to wax lyrical about the causal relationship between news and certain types of stock index, illustrated thus:
[Figure: diag_wave3]
There are lots of assumptions here, including the efficiency and transparency of the market, and the assumption that the investor doesn’t have inside knowledge. I also assume that the investor is informed only in this way: he/she has no conception of real intrinsic value until such time as he/she is informed via company reports and the like. So a good candidate index would be a tech stock, where the actual commodity might be ambiguous and the index is swayed more by opinion than by real worth. To summarize, the figure relates to a public company whose index is a strong function of investor feedback from news. We would like to exploit this fact, for the class of companies for which this might be true, by using machine learning to parse news and generate a signal which is some function of the company’s index. We are relying on the power of the written word to influence events far into the future.

For example, consider a fictitious tech consultancy (offering the somewhat ambiguous commodity of ‘useful information’) that features regularly in your favorite tech RSS feed. Let’s naïvely assign a value of +1 to a positive word and -1 to a negative word. You may read at alternate times:

That’s preposterous, what they offer is useful
That’s useful, what they offer is preposterous

Both statements convey both positive and pejorative messages directed at different parties, yet are composed of the same words. So in classifying a statement using machine learning on your PS3 Yellow Dog cluster, we must come up with both a useful dictionary and a means to encode context, even before assigning value to words. A naïve assignment of value to words gives each statement a sum total of 0, even though they convey drastically different opinions. Further, frequency can be useful. Consider:

That’s preposterous, what they offer is very, very useful

Obviously ‘very’ is used to provide emphasis, so the frequency of words also has a bearing. Consider finally the statement:

CEO Joe Blo will have neurosurgery on June 11

Since the company is basically selling information, the CEO’s power to provide information is going to change drastically after 06/11. Thus there is some uncertainty, and we would expect sentiment to sway between negative and positive in the interim, i.e., words may take a ‘value’ anywhere in the continuum between -1 and +1.
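The cancellation in the first pair of statements is easy to check. Below is a minimal sketch in bash/awk; the function name naive_score and its two-word dictionary (useful = +1, preposterous = -1) are made up for illustration and are not a real sentiment lexicon.

```shell
#!/usr/bin/env bash
# naive_score: sum of per-word values for one statement.
# The dictionary is hypothetical and only covers the example words.
naive_score() {
  printf '%s\n' "$1" |
    tr '[:upper:]' '[:lower:]' |   # case-insensitive matching
    tr -c 'a-z\n' ' ' |            # punctuation becomes whitespace
    awk '
      BEGIN { val["useful"] = 1; val["preposterous"] = -1 }
      { for (i = 1; i <= NF; i++) if ($i in val) total += val[$i] }
      END { print total + 0 }'
}

naive_score "That's preposterous, what they offer is useful"
naive_score "That's useful, what they offer is preposterous"
```

Both calls print 0, which is exactly the problem: a plain sum of word values cannot tell who is being praised and who is being dismissed.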

Finally, some measure of the reliability or impact of the news source proves helpful in weighting this source against others. This ‘impact factor’ could be simply determined from website volume, or from something more complicated like a Markov Chain ranking algorithm. To summarize then, in order to classify text and generate ‘signal’ from an RSS feed, at the very least we require:

  • a dictionary
  • a means to encode word:
    • value
    • context
    • frequency
  • the impact of the news source
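
As a rough illustration of how three of these ingredients might combine, the sketch below extends the naive word-value sum described earlier: word values come from a tiny dictionary, each ‘very’ boosts the next valued word (frequency), and the total is scaled by a per-source impact factor. The function name weighted_score, the boost of 0.5 per ‘very’, the dictionary and the impact values are all assumptions chosen for illustration; context is deliberately not handled.

```shell
#!/usr/bin/env bash
# weighted_score TEXT [IMPACT]
# Hypothetical scoring: dictionary value x emphasis boost, scaled by source impact.
weighted_score() {
  printf '%s\n' "$1" |
    tr '[:upper:]' '[:lower:]' |
    tr -c 'a-z\n' ' ' |
    awk -v impact="${2:-1}" '
      BEGIN { val["useful"] = 1; val["preposterous"] = -1; boost = 1 }
      {
        for (i = 1; i <= NF; i++) {
          if ($i == "very") { boost += 0.5; continue }  # each "very" adds emphasis
          if ($i in val) total += val[$i] * boost       # valued word uses the pending boost
          boost = 1                                     # emphasis applies to the next word only
        }
      }
      END { print total * impact }'
}

weighted_score "That's preposterous, what they offer is very, very useful" 1
weighted_score "That's preposterous, what they offer is very, very useful" 0.5
```

With an impact of 1 the statement scores 1 (-1 for ‘preposterous’, +2 for the doubly emphasised ‘useful’), and the same text from a source with impact 0.5 scores 0.5. Swapping ‘useful’ and ‘preposterous’ in the earlier pair still gives the same score for both, since nothing here encodes context.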

How we classify the very large amount of information produced in this manner to produce useful signal is a trickier topic requiring a little geometry and perhaps Bayes. To get the creative juices bubbling, here are a couple of lines of code to play with in bash:

# date | cut -c12-19 > foo.txt

# curl --silent 'http://rss.slashdot.org/Slashdot/slashdot' | awk '{for (i=1;i<=NF;i++) { if ($i=="is") {print NR,i}}}' >> foo.txt

Next time I’ll provide a more rigorous example and show how we can produce useful signal from an ensemble of SVMs. (In the meantime you may want to check out SVM Light). Last but not least, I’ll go over ideas/objects from Hilbert space including completeness and wavelets, and how we can ultimately use some math with our signal for robust time series prediction.

ED: go easy with curl, you don’t want to come across as a robot and get your IP/domain banned :)


Complementarity

September 26, 2008

Here’s an extract from an old essay of mine, from way the heck back in 1995, submitted in PH237 at UQ, document link follows:

Bohr’s well documented opposition to Einstein’s corpuscular theory of light abated in view of the Bothe-Geiger experiments, in which the particle nature of radiative phenomena manifested itself. These results were in blatant contradiction with the Bohr-Kramers-Slater interpretation of the interaction between atomic systems. The B-K-S paper assumed that the radiative aspects of atomic transitions were solely describable in terms of the “wave picture”. It was this descriptive contrast which prompted Bohr to find a harmonious relationship between the particle-wave nature of the radiative aspects of quantum interactions.

Essay