Data Visualization PCA vs MDS

In progress. Posted 7 years ago. Paid on delivery.

The aim of this application is to collect data on how users prefer to visualize multidimensional data, reduced to 2D using PCA or using MDS.

PCA = Principal Component Analysis.

MDS = Multidimensional Scaling.

The program has to be coded in R.

The program has to contain comments in English and be readable.

The input is a list of CSV files (on the hard disk); all data is numerical.

This list has to be easy to modify.
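One way to keep the file list easy to modify is to hold it in a single character vector at the top of the script. The sketch below is only an illustration: the file names, the helper names `load_numeric_csv` and `load_all`, and the assumption that each CSV has a header row are not part of the posting.

```r
# Hypothetical file names; edit this vector to change which data sets are processed.
csv_files <- c("data/set1.csv", "data/set2.csv")

# Read one CSV and return it as a numeric matrix.
# Assumes a header row; stops if any column is not numeric.
load_numeric_csv <- function(path) {
  df <- read.csv(path, header = TRUE)
  stopifnot(all(vapply(df, is.numeric, logical(1))))
  as.matrix(df)
}

# Load every listed file that exists into a list named by file name.
load_all <- function(paths) {
  paths <- paths[file.exists(paths)]
  datasets <- lapply(paths, load_numeric_csv)
  names(datasets) <- basename(paths)
  datasets
}
```

Calling `load_all(csv_files)` would then give one matrix per data set; files missing from disk are skipped silently, which may or may not be the desired behaviour.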

The program has to load each data set into memory and then:

Use k-means clustering to classify all samples.

Run the clustering multiple times with K = 3 to 8 and use the silhouette score to select the best K value.
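The K selection step could be sketched as follows, using `kmeans` from base R and `silhouette` from the `cluster` package. The function names and the `nstart = 10` restarts are assumptions for illustration, and the sketch expects each data set to have clearly more rows than the largest K.

```r
library(cluster)  # provides silhouette()

# Mean silhouette width of a labelling under Euclidean distance.
mean_silhouette <- function(x, labels) {
  sil <- silhouette(labels, dist(x, method = "euclidean"))
  mean(sil[, "sil_width"])
}

# Fit k-means for every k in ks and keep the fit with the highest mean silhouette.
best_kmeans <- function(x, ks = 3:8, nstart = 10) {
  fits <- lapply(ks, function(k) kmeans(x, centers = k, nstart = nstart))
  scores <- vapply(fits, function(f) mean_silhouette(x, f$cluster), numeric(1))
  best <- which.max(scores)
  list(k = ks[best], labels = fits[[best]]$cluster, score = scores[best])
}
```

Because k-means starts from random centres, calling `set.seed()` before `best_kmeans()` makes the chosen K reproducible between runs.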

Then use PCA, select the first two principal components, and make a graph.

Also use MDS to create a 2D graph.
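A minimal sketch of the two projections, assuming `prcomp` for PCA and classical MDS via `cmdscale`. The posting does not say which MDS variant is wanted or whether variables should be scaled, so both choices here are assumptions.

```r
# First two principal components (centred, not scaled; scaling is an assumption).
pca_2d <- function(x) {
  prcomp(x, center = TRUE, scale. = FALSE)$x[, 1:2]
}

# Classical (metric) MDS on Euclidean distances, reduced to two dimensions.
mds_2d <- function(x) {
  cmdscale(dist(x, method = "euclidean"), k = 2)
}
```

Note that classical MDS on Euclidean distances reproduces the PCA scores up to the sign of each axis, so the two graphs would look the same apart from mirroring. If the study is meant to compare visibly different layouts, a non-metric variant such as `MASS::isoMDS` may be closer to the intent, but that is a guess.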

Then calculate the silhouette score for each graph (over the 2D coordinates), and also over the original clustering.

Use Euclidean distance (this should be the default)
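The three scores per data set might be computed as below. The sketch assumes that "over 2D" means reusing the cluster labels found on the original data and measuring silhouette on the projected coordinates, rather than re-clustering in 2D; the function name `silhouette_scores` is hypothetical.

```r
library(cluster)  # provides silhouette()

# Mean silhouette of the same labels in the original space and in both 2D projections.
silhouette_scores <- function(x, labels) {
  score <- function(coords) {
    sil <- silhouette(labels, dist(coords, method = "euclidean"))
    mean(sil[, "sil_width"])
  }
  c(clustering = score(x),
    pca_2d = score(prcomp(x)$x[, 1:2]),
    mds_2d = score(cmdscale(dist(x), k = 2)))
}
```

`labels` is expected to be an integer vector such as the `cluster` component returned by `kmeans`.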

Then save a table with the silhouette scores:

Sil = silhouette score.

FileName, 1) Sil for clustering, 2) Sil for PCA 2D, 3) Sil for MDS 2D.

The table should be saved in CSV format.
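Writing the table could look like the sketch below. The column names and the helper `write_score_table` are assumptions; the posting only fixes the column order.

```r
# Build the score table (one row per input file) and save it as CSV.
write_score_table <- function(file_names, sil_clustering, sil_pca, sil_mds, path) {
  scores <- data.frame(
    FileName = file_names,
    SilClustering = sil_clustering,
    SilPCA2D = sil_pca,
    SilMDS2D = sil_mds
  )
  write.csv(scores, path, row.names = FALSE)
  invisible(scores)
}
```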

Then save everything (the 2D graphs for each data set) in files on the hard disk, and have a second program that collects user studies from those files.
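Saving one graph per projection could be done with the base `png` device, as in this sketch. The image size, axis labels, and the helper name `save_scatter` are assumptions, and PNG is only one possible file format. Points are coloured by their cluster label.

```r
# Save a 2D scatter plot as PNG, colouring each point by its cluster label.
save_scatter <- function(coords, labels, path, title) {
  png(path, width = 800, height = 800)
  on.exit(dev.off())  # close the device even if plotting fails
  plot(coords[, 1], coords[, 2], col = labels, pch = 19,
       xlab = "Dimension 1", ylab = "Dimension 2", main = title)
  invisible(path)
}
```

Integer labels from 1 to 8 map onto the eight colours of R's default palette, which matches the K range in the posting.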

COLLECTING USER STUDIES:

Record the following questions:

How old are you? (read number)

Did you work as a data scientist? (read text)

What is your name? (read text)
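The three questions could be read from the keyboard roughly like this. The function name and the `ask` parameter are illustrative; passing the prompt function as an argument is a design choice that makes it possible to swap in a scripted reader.

```r
# Ask the three participant questions and return the answers as a list.
# `ask` takes a prompt string and returns the typed answer as text.
ask_participant <- function(ask = readline) {
  list(
    age = as.numeric(ask("How old are you? ")),
    data_scientist = ask("Did you work as a data scientist? "),
    name = ask("What is your name? ")
  )
}
```

`readline` only waits for input in an interactive R session. When the script is run with `Rscript`, a reader built on `readLines(file("stdin"), n = 1)` would be needed instead.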

The next step is repeated for each pair of graphs generated with PCA and MDS.

Use the clustering results to choose the colour of the points on the graph.

Show both graphs of the same data set at the same time, in random order.

Then collect two numbers from the keyboard: the rating for the first graph (1 to 10) and the rating for the second graph (1 to 10). There is no need to validate the numbers, just collect them from the keyboard.

This loop keeps repeating until the age of a participant is [url removed, login to view]. For each participant and each pair of graphs, record the participant's answers plus the order of the graphs (PCA,MDS or MDS,PCA) and the responses to the questions.
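The rating step for one participant might be sketched as below. The `show` callback that displays the two graphs is hypothetical and left as a stub. The stopping rule on the participant's age is cut off in the posting, so it is not implemented here.

```r
# For each data set: pick a random display order, show both graphs,
# read two ratings, and record the order together with the ratings.
rate_pairs <- function(datasets, ask = readline,
                       show = function(first, second) NULL) {
  rows <- lapply(datasets, function(name) {
    ord <- sample(c("PCA", "MDS"))  # random first/second order
    show(paste(name, ord[1]), paste(name, ord[2]))
    first <- as.numeric(ask("Rating for the first graph (1 to 10): "))
    second <- as.numeric(ask("Rating for the second graph (1 to 10): "))
    data.frame(dataset = name,
               order = paste(ord, collapse = ","),
               rating_first = first,
               rating_second = second)
  })
  do.call(rbind, rows)
}
```

The returned data frame has one row per data set and could be combined with the participant's answers and appended to a results CSV.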

R Programming Language

Project ID: #10367088

About the project

3 proposals. Remote project. Active 7 years ago.

Awarded to:

instantProgr

Dear Madam or Sir, I have great experience in machine learning, especially in R. I have implemented the methods (PCA, MDS, k-means) from scratch in R, Python, and C. Hence, I know the intricate details of the algori…

€23 EUR in 1 day
(0 reviews)
0.0

3 freelancers are bidding an average of €78 for this job

sortida88

Hi. We are 3 PhD students in Statistics (Dr. Torretta became a PhD three weeks ago). We have a good background (Excel, R, STATA, SAS and SPSS and Latex) in every direction which involves Statistics. We'll be glad to wo…

€100 EUR in 30 days
(0 reviews)
0.0