NANCY OTERO

New data & infrastructure: the Infinite Pipeline

4/5/2019

This week I used my new data, 70,000 (from the 300,0000) articles from Instructables, to play with ELMo and TF-IDF and to compute cosine similarity. The results were much better! While making my data "trainable" I read a lot of articles explaining that cleaning and massaging the data can be 80% of the work in ML. This was a great lesson.
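To make the TF-IDF plus cosine similarity idea concrete, here is a minimal pure-Python sketch. It is not the code I ran on the Instructables data: the three toy documents, the whitespace tokenizer and the smoothed IDF formula are all assumptions chosen for illustration, and ELMo embeddings are not covered here.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Smoothed inverse document frequency (an assumed, common variant).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: (n / len(tokens)) * idf[t] for t, n in tf.items()})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-ins for article text (not real Instructables data).
docs = [
    "build a wooden bird house",
    "build a wooden book shelf",
    "bake chocolate chip cookies",
]
vectors = tfidf_vectors(docs)
sim_wood = cosine_similarity(vectors[0], vectors[1])
sim_mixed = cosine_similarity(vectors[0], vectors[2])
```

The two woodworking documents share terms, so `sim_wood` comes out higher than `sim_mixed`, which is zero because those documents have no words in common.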

Overall process:
  1. Understand the problem (see previous post)
  2. Decide what data is needed (see previous post)
  3. Research what data is actually available (see previous post)
  4. Get the data (see previous post)
  5. Understand the data
  6. Select labels
  7. Clean the data (separate or remove URLs, remove non-useful signs, NaNs, etc.)
  8. Preprocess it (convert it to the right format, in this case)
  9. Design a dataset for it
  10. Load it and store it 
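As a rough illustration of the cleaning step, here is a small sketch that separates URLs from the text, strips non-useful signs and skips missing values. The regular expressions, the `clean_text` helper and the sample rows are hypothetical; the real rules depend on what the data looks like.

```python
import math
import re

# Assumed patterns; adjust to the actual data.
URL_RE = re.compile(r"https?://\S+")
SYMBOL_RE = re.compile(r"[^a-z0-9\s.,!?'-]")  # applied after lowercasing
WS_RE = re.compile(r"\s+")


def clean_text(raw):
    """Return (clean_text, urls), or None when the value is missing (None/NaN)."""
    if raw is None or (isinstance(raw, float) and math.isnan(raw)):
        return None
    urls = URL_RE.findall(raw)              # keep the URLs separately
    text = URL_RE.sub(" ", raw)             # remove them from the text
    text = SYMBOL_RE.sub(" ", text.lower())  # drop non-useful signs
    text = WS_RE.sub(" ", text).strip()     # collapse leftover whitespace
    return text, urls


# Made-up rows standing in for scraped article text.
rows = [
    "Glue the LED ★ to the board!! See https://example.com/led",
    float("nan"),
    None,
    "   ",
]

# Drop missing rows and rows that are empty after cleaning.
cleaned = [c for c in (clean_text(r) for r in rows) if c is not None and c[0]]
```

Only the first row survives: its URL is pulled out into a separate list and the star symbol is removed from the text.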
I used tf.Transform, TFRecord and tf.Example for the last three steps (preprocessing, designing a dataset, loading and storing it). I picked tf.Transform and TFRecord (tf.Example is the record format a TFRecord file typically stores) because I wanted to learn something that was easily scalable, could integrate with my Colab notebook, and allowed for parallel processing and monitoring.
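For readers who have not seen the format, here is a minimal sketch of the store-and-load part using `tf.train.Example` and TFRecord in TensorFlow 2. It is not my actual pipeline and it leaves out tf.Transform entirely: the `title` and `text` feature names, the two sample articles and the temporary file path are assumptions for illustration.

```python
import os
import tempfile

import tensorflow as tf


def make_example(title, text):
    """Wrap one article as a tf.train.Example with two string features."""
    feature = {
        "title": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[title.encode("utf-8")])),
        "text": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[text.encode("utf-8")])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


# Made-up articles standing in for the cleaned data.
articles = [
    ("Paper lamp", "fold the paper and add a light"),
    ("Wooden shelf", "cut the wood and sand it"),
]

# Store: write one serialized Example per record.
path = os.path.join(tempfile.mkdtemp(), "articles.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    for title, text in articles:
        writer.write(make_example(title, text).SerializeToString())

# Load: describe the expected features, then parse each record.
feature_spec = {
    "title": tf.io.FixedLenFeature([], tf.string),
    "text": tf.io.FixedLenFeature([], tf.string),
}


def parse(serialized):
    return tf.io.parse_single_example(serialized, feature_spec)


dataset = tf.data.TFRecordDataset(path).map(parse)
titles = [record["title"].numpy().decode("utf-8") for record in dataset]
```

Because the result is a `tf.data` dataset, the usual `batch`, `shuffle` and `prefetch` calls can be chained onto it before training.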
