Which two methods of single-cell transcriptomic data ordering could be used to understand how gene expression changes as stem cells differentiate?

Clustering groups genes, cells or both on the basis of similarity, which divides the data into subpopulations. Pseudotemporal ordering places cells in order of their progression through a biological process rather than by the time at which they were sampled. Network inference looks at genes that are expressed together and infers a potential functional relationship between them. I believe that, for analysing the progression of stem cells through their differentiation pathway, the combination of pseudotemporal ordering and clustering would be the most informative.

When we look at stem cells at specific time points, we need to remember that differentiation is a relatively stochastic process: different cells begin to differentiate at different times. Ordering cells purely by when they were collected would therefore be largely inaccurate, and it is wiser to order them by how far they have progressed through the stages of differentiation. To work out each cell's stage of progression, however, we first need to group similar cells together, and for this we need clustering. Clustering assumes that the data are composed of biologically distinct groups, whereas pseudotemporal ordering treats the biological process as a continuous spectrum. By combining the two, we first separate the data according to biological characteristics and then link the groups together to illustrate the continuum of differentiation.

K-means is a simple example of how a clustering step works. It is a flat (non-hierarchical) way to cluster data. The user chooses in advance how many groups (k) the data should be divided into, and each group receives a centre point. Every data point is assigned to the centre point it is nearest to. Once all the data points have been assigned, the positions of the points in each group are averaged to give a new centre point. The data points are then reassigned to the nearest centre point, and the process is repeated until no further changes occur. The assumption here is that cells at the same stage of differentiation will have similar expression patterns.

Wishbone is one tool that performs pseudotemporal ordering on the basis of such similarities. Rather than k-means, it uses a k-nearest-neighbour graph, in which each cell is linked to the k cells whose expression profiles are closest to its own. It then builds the pseudotemporal timeline from walks through this graph: each cell is positioned according to the shortest path connecting it to a chosen early cell, and the estimate is refined by averaging the shortest-path distances measured from a set of waypoint cells.

An alternative programme that relies more directly on clustering for pseudotemporal ordering is Mpath. Instead of walking through a neighbour graph, it generates the timeline by creating clusters of cells, linking the clusters with edges and projecting the single cells onto the edges between them. It then removes the weak edges. This allows for branched pathways of differentiation and does not assume that all cells follow the same pathway, an assumption that is unlikely to hold in a mixed cell population.

Overall, I believe that the combination of clustering and pseudotemporal ordering will give us insight into how the gene expression of stem cells changes as their differentiation progresses.
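To make the k-means steps described above more concrete (assign each point to its nearest centre point, average each group to get a new centre point, repeat until nothing changes), here is a minimal sketch in plain Python. The function names `kmeans` and `squared_distance` and the toy expression values are my own assumptions for illustration. Real single-cell data would have thousands of genes and would normally be handled with an established library.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(cells, k, seed=0, max_iter=100):
    """Cluster expression vectors into k groups; returns (labels, centres)."""
    rng = random.Random(seed)
    # Start from k distinct cells chosen at random as the initial centre points.
    centres = [list(c) for c in rng.sample(cells, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each cell joins its nearest centre point.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(cell, centres[j]))
            for cell in cells
        ]
        if new_labels == labels:
            break  # no cell changed group, so the clustering is stable
        labels = new_labels
        # Update step: move each centre to the mean of its assigned cells.
        for j in range(k):
            members = [c for c, lab in zip(cells, labels) if lab == j]
            if members:  # an empty group keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Toy data: two genes measured in six cells (hypothetical values).
cells = [[1.0, 0.9], [1.2, 1.1], [0.8, 1.0], [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
labels, centres = kmeans(cells, k=2)
print(labels)
```

The first three cells end up in one group and the last three in the other. Which group is numbered 0 depends on the random starting centres, and that is one reason k-means results are usually checked over several runs.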
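The idea of linking each cell to its closest neighbours and then ordering cells by the shortest paths between them can also be sketched in a few lines. This is only a toy illustration of graph-based pseudotime under my own assumptions, not the actual Wishbone or Mpath code. The function names `knn_graph` and `pseudotime`, the choice of starting cell and the two-gene data are all hypothetical.

```python
import heapq
import math


def knn_graph(cells, k):
    """Link every cell to its k nearest neighbours (undirected, weighted by distance)."""
    graph = {i: {} for i in range(len(cells))}
    for i, a in enumerate(cells):
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(cells) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def pseudotime(cells, start, k=2):
    """Shortest-path distance of every cell from the chosen start cell."""
    graph = knn_graph(cells, k)
    dist = {i: math.inf for i in graph}  # unreachable cells stay at infinity
    dist[start] = 0.0
    queue = [(0.0, start)]
    while queue:  # Dijkstra's algorithm over the neighbour graph
        d, i = heapq.heappop(queue)
        if d > dist[i]:
            continue  # stale queue entry, a shorter path was already found
        for j, w in graph[i].items():
            if d + w < dist[j]:
                dist[j] = d + w
                heapq.heappush(queue, (d + w, j))
    return [dist[i] for i in range(len(cells))]


# Hypothetical cells listed in scrambled order along one differentiation path.
cells = [[3.0, 2.1], [0.0, 0.0], [2.0, 1.2], [1.0, 0.4], [4.0, 3.5]]
times = pseudotime(cells, start=1)  # cell 1 is assumed to be the least differentiated
order = sorted(range(len(cells)), key=lambda i: times[i])
print(order)  # prints [1, 3, 2, 0, 4]
```

The cells are recovered in order of their distance along the path from the starting cell, regardless of the order in which they were listed. This is the sense in which pseudotime replaces sampling time.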

Answered by Alicja B. • Biology tutor
