
Develop Text Classification and/or Clustering Algorithms in Python

$250-750 USD

Completed
Posted almost 8 years ago

Paid on delivery
We require assistance with the following tasks. Please contact us directly to describe how you would solve them. Russian language skills may be necessary.

1) Task: Develop or employ a text-classification algorithm in Python or R that classifies items as one of several thousand 10-digit product codes, using a descriptive text field of roughly 300 characters in UTF-8 (Russian / Cyrillic).
Description: We have a database of several million textual descriptions of products that have been entered by humans. Each entry is connected to a 10-digit product code, but the same product code can be used for multiple differing textual entries. We require a text-classification algorithm that probabilistically classifies a document and that can then be applied to another dataset (see task 2). This task requires tokenizing, stemming, and removing stop words, so you may need to know Russian or to use available NLTK packages. Similarly, several different algorithms may need to be used to improve precision.
Output: Python scripts/algorithm(s) classifying documents into 10-digit product codes that can be used in task 2.

2) Task: Use the classification algorithm in (1) to classify textual entries in a second dataset.
Description: Once the clean list has been created, employ a machine learning algorithm to assign the 10-digit codes to a target dataset of over 60 million textual product descriptions in UTF-8 (Russian / Cyrillic). Not all entries will have sufficient information to be classified ('leftovers'), and these should be marked as such. For example, an entry could be marked as a leftover if no classification has a probability above some threshold. Also, the dataset in (1) only contains examples of a subset of the items in the second dataset, but we will be able to estimate which items these are.
Output: The second dataset of over 60 million entries, matched to 10-digit product codes.
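The thresholding idea in task 2 (mark an entry as a 'leftover' when no code is probable enough) could look roughly like the sketch below. It is an illustration under stated assumptions, not a prescribed solution: the classifier is a small multinomial naive Bayes written with the standard library only, the stop-word list, the `LEFTOVER` marker, the 0.8 threshold, and the two product codes are all made up, and stemming is left out for brevity.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real one would come from a Russian NLP resource.
STOP_WORDS = {"и", "в", "на", "для", "с", "из"}
LEFTOVER = "LEFTOVER"  # hypothetical marker for entries that cannot be classified


def tokenize(text):
    """Lowercase, keep runs of letters (including Cyrillic), drop stop words."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, codes):
        self.class_counts = Counter(codes)
        self.word_counts = defaultdict(Counter)
        for text, code in zip(texts, codes):
            self.word_counts[code].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_proba(self, text):
        """Return {code: posterior probability} for one description."""
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for code, n_docs in self.class_counts.items():
            counts = self.word_counts[code]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            log_scores[code] = score
        # Normalise in log space so long texts do not underflow to zero.
        top = max(log_scores.values())
        scaled = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(scaled.values())
        return {c: v / norm for c, v in scaled.items()}

    def classify(self, text, threshold=0.8):
        """Return (code, probability), or (LEFTOVER, probability) below threshold."""
        probs = self.predict_proba(text)
        code, p = max(probs.items(), key=lambda kv: kv[1])
        return (code, p) if p >= threshold else (LEFTOVER, p)


# Toy training data with made-up 10-digit codes.
train = [
    ("яблоки свежие зелёные", "0000000001"),
    ("яблоки красные свежие", "0000000001"),
    ("телефон мобильный сенсорный", "0000000002"),
    ("мобильный телефон с зарядкой", "0000000002"),
]
clf = NaiveBayes().fit([t for t, _ in train], [c for _, c in train])

print(clf.classify("свежие яблоки"))  # both words are typical of the first code
print(clf.classify("товар"))          # no known words, so no code is probable enough
```

On this toy data the first description is assigned the first code, while the second, which shares no words with the training set, stays at an even split between the codes and is marked as a leftover. At the scale described in the task, a library implementation with sparse matrices would likely be needed instead of per-document dictionaries.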
3) Task: For the 'leftovers' of (2), develop or employ a text clustering algorithm that groups entries into k subclasses.
Description: We will provide you with a higher-level grouping variable for the 'leftovers' and a number k that designates how many clusters we need within each grouping. Your task will be to use a text clustering algorithm to create k clusters within each of the higher-level groups for the 'leftovers'.
Output: A unique variable designating cluster membership for each item in the 'leftovers' (those without 10-digit product codes from step 2).
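One possible reading of task 3 is to run a separate k-way clustering inside each higher-level group and label every item with a group-plus-cluster identifier. The sketch below assumes that reading. It uses spherical k-means (cosine similarity on bag-of-words vectors) with a deterministic farthest-first initialisation, written with the standard library only; the function names, the `group-label` identifier format, and the toy records are hypothetical.

```python
import math
import re
from collections import defaultdict


def vectorize(text):
    """Bag-of-words counts, L2-normalised so a dot product equals cosine similarity."""
    counts = defaultdict(float)
    for token in re.findall(r"[^\W\d_]+", text.lower()):
        counts[token] += 1.0
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {w: v / norm for w, v in counts.items()}


def cosine(a, b):
    """Dot product of two sparse unit vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b.get(w, 0.0) for w, v in a.items())


def mean_vector(vectors):
    """Normalised sum of sparse vectors, used as the cluster centroid."""
    total = defaultdict(float)
    for vec in vectors:
        for w, v in vec.items():
            total[w] += v
    norm = math.sqrt(sum(v * v for v in total.values())) or 1.0
    return {w: v / norm for w, v in total.items()}


def kmeans(vectors, k, iterations=20):
    """Spherical k-means; returns one cluster label (0..k-1) per vector."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Farthest-first seeding: pick the vector least similar to its nearest seed.
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = None
    for _ in range(iterations):
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels


def cluster_leftovers(records, k):
    """records: list of (group, text). Returns one 'group-label' id per record."""
    by_group = defaultdict(list)
    for idx, (group, _) in enumerate(records):
        by_group[group].append(idx)
    cluster_ids = [None] * len(records)
    for group, idxs in by_group.items():
        vectors = [vectorize(records[i][1]) for i in idxs]
        labels = kmeans(vectors, min(k, len(vectors)))
        for i, label in zip(idxs, labels):
            cluster_ids[i] = f"{group}-{label}"
    return cluster_ids


# Toy leftovers with a hypothetical grouping variable.
records = [
    ("food", "яблоки свежие"),
    ("food", "яблоки красные"),
    ("food", "молоко коровье"),
    ("food", "молоко козье"),
    ("tools", "молоток стальной"),
    ("tools", "отвёртка крестовая"),
]
cluster_ids = cluster_leftovers(records, k=2)
for (group, text), cid in zip(records, cluster_ids):
    print(cid, text)
```

Prefixing the label with the group keeps the identifier unique across groups, which is what the requested output variable appears to need. The sketch omits stemming and stop-word removal, and a group with fewer than k items simply gets fewer clusters.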
Project ID: 10479220

About the project

18 proposals
Remote project
Active 8 years ago

Awarded to:
Hello! My name is Andrey. I'm a physicist from Russia with experience in the machine learning field. I know how to implement ML methods in practice. For example, I developed a predictive algorithm for sports betting. You can find additional information about this job in my Upwork profile. I also have experience in text classification: I developed a model for classifying Wikipedia articles at my work. You can also find me on kaggle.com; my nick is gradiente. But first of all I have to estimate the amount of work and see text samples for classification. Hope to work with you on this project!
$611 USD in 20 days
5.0 (6 reviews)
5.5
18 freelancers are bidding an average of $1,329 USD for this job
Hi there. I would love to be part of this project as it seems very interesting. I am a data scientist with experience applying data mining algorithms to large amounts of data for prediction and description. I do not have knowledge of the Russian language, but I do have experience using existing packages to preprocess data. I would do all tasks in Python. Hope to hear back from you soon. Thanks, Daniel
$526 USD in 10 days
4.9 (101 reviews)
7.8
We are a group of Data Scientists based in Bangalore. Our core areas of expertise are big data and machine learning.
$10,000 USD in 40 days
4.9 (9 reviews)
6.4
I am a computer science professional with a PhD degree and excellent skills in Python and a number of other languages. I've done many projects involving clustering or classification. I'm also a fluent Russian speaker. Please see reviews on my profile. It would be my pleasure to do your project. Here is another large project in which I had to process a large volume of texts in Russian using Python: https://www.freelancer.com/projects/Python/Data-Extraction-from-Word-documents/
$1,000 USD in 10 days
5.0 (59 reviews)
6.3
I am very interested in your project. I have experience in this field. If you work with me, you will succeed. I am ready to work with you now. Phon.
$736 USD in 10 days
4.9 (25 reviews)
5.8
Dear Client, Greetings from Flowgica technologies. I have experience with these skills. We have similar experience, so I look forward to discussing the project and moving ahead. Please check our freelancer portfolio at https://www.freelancer.com/u/mmadi.html?page=portfolio I am ready to work with you and kindly await your response. Thanks & Regards, Mmadi
$600 USD in 10 days
5.0 (1 review)
4.0
My name is Mike and I'm from the UK. I work with individual clients and also provide outsourcing services for a number of UK- and USA-based agencies. Your project description sounds interesting to me, and I have the skills and experience required to complete this project. I can show you some examples of my work. Please contact me to discuss your project.
$555 USD in 10 days
5.0 (1 review)
3.2
I have gone through your requirements. We have done a similar kind of job before. Looking forward to your earliest reply for a project discussion.
$555 USD in 10 days
0.0 (0 reviews)
0.0
Hello, I understand the initial scope of this project, although I want to discuss the job further in order to prepare the final concept. After a complete discussion by call or chat, I will prepare the following for you:
- Technical project proposal
- Flow chart for this project
- Execution plan (step-by-step procedure with an explanation of how we are going to execute each task)
$773 USD in 20 days
0.0 (0 reviews)
0.0
Currently I'm working part time, where I'm using R on a daily basis. I have practical experience with R programming and also with classification algorithms, text mining, clustering and machine learning. I'm also a student in the field of Economics and Econometrics in Prague.
$1,666 USD in 10 days
0.0 (0 reviews)
0.0
A proposal has not yet been provided
$1,111 USD in 21 days
0.0 (0 reviews)
0.0

About this client

Washington, United States
5.0
6
Payment method verified
Member since Jan 6, 2016

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)