Lingua Franca Translation

Getting Started Notebook for Lingua Franca Translation

A getting started notebook for the challenge.

ashivani

Getting Started with Lingua Franca Translation

In this puzzle, we have to translate from the crowd-talk language into English. There are multiple ways to build the translator:

  • Using Dictionary and Mapping
  • Using LSTM
  • Using Transformers

In this starter notebook, we'll go with the dictionary-and-mapping approach. Here we'll create a dictionary of words for both English and the crowd-talk language.
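To make the idea concrete before touching the real data, here is a minimal sketch of a word-for-word dictionary translator. The sentence pairs and the helper names `build_mapping` and `translate` are made up for illustration; they are not part of the challenge data or of any library.

```python
# Toy parallel data; these sentences are invented for illustration only.
pairs = [
    ("foo bar", "hello world"),
    ("bar baz", "world again"),
]

def build_mapping(pairs):
    """Map each source word to the target word at the same position."""
    mapping = {}
    for source, target in pairs:
        for src_word, tgt_word in zip(source.split(), target.split()):
            mapping[src_word] = tgt_word  # later pairs overwrite earlier ones
    return mapping

def translate(sentence, mapping):
    """Translate word by word, skipping words the mapping does not know."""
    return " ".join(mapping[w] for w in sentence.split() if w in mapping)

toy_mapping = build_mapping(pairs)
print(translate("foo baz qux", toy_mapping))  # prints "hello again"
```

The sketch assumes the two languages line up one word to one word, which is the same simplification the rest of this notebook relies on.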

Download the files 💾

Download AIcrowd CLI

We will first install aicrowd-cli, which lets us download the dataset and later make a submission directly from the notebook.

In [ ]:
%%capture
!pip install aicrowd-cli
%load_ext aicrowd.magic

Log in to AIcrowd ㊗

In [ ]:
%aicrowd login
Please login here: https://api.aicrowd.com/auth/NPz72ux6cPJoh9ZbLHQWW3v_BO3gSIlOlqpxPVjWbjo
API Key valid
Saved API Key successfully!

Download Dataset

We will create a folder named data and download the files there.

In [ ]:
!rm -rf data
!mkdir data
%aicrowd ds dl -c lingua-franca-translation -o data

Importing Necessary Libraries

In [ ]:
import os
import pandas as pd
import gensim
from sklearn.metrics.pairwise import cosine_similarity

Diving into the dataset:

In [ ]:
train_df = pd.read_csv("data/train.csv")
In [ ]:
train_df.head()
Out[ ]:
id crowdtalk english
0 31989 wraov driourth wreury hyuirf schneiald chix lo... upon this ladder one of them mounted
1 29884 treuns schleangly kriaors draotz pfiews schlio... and solicited at the court of Augustus to be p...
2 26126 toirts choolt chiugy knusm squiend sriohl gheold but how am I sunk!
3 44183 schlioncy yoik yahoos dynuewn maery schlioncy ... the Yahoos draw home the sheaves in carriages
4 19108 treuns schleangly tsiens mcgaantz schmeecks tr... and placed his hated hands before my eyes
In [ ]:
english = train_df.english.values
crowdtalk = train_df.crowdtalk.values
In [ ]:
english
Out[ ]:
array(['upon this ladder one of them mounted',
       'and solicited at the court of Augustus to be preferred to a greater ship',
       'but how am I sunk!', ..., '“But my toils now drew near a close',
       'going as soon as I was dressed to pay my attendance upon his honour',
       'for there was no sign of any violence except the black mark of fingers on his neck.'],
      dtype=object)
In [ ]:
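# Tokenize every English sentence (lowercased) and flatten into one long word list
processedLines = [gensim.utils.simple_preprocess(sentence) for sentence in english]
eng_word_list = [word for words in processedLines for word in words]
In [ ]:
# Same for the crowd-talk sentences
processedLines = [gensim.utils.simple_preprocess(sentence) for sentence in crowdtalk]
crowdtalk_word_list = [word for words in processedLines for word in words]
In [ ]:
# Pair the two flattened lists position by position; a repeated crowd-talk word keeps its last pairing
dict1 = dict(zip(crowdtalk_word_list, eng_word_list))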
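Zipping the two flattened lists only keeps words paired correctly while every sentence contributes the same number of tokens on both sides. The sample rows suggest that is not always so: row 2 has seven crowd-talk words for a five-word English sentence, and `simple_preprocess` by default also drops one-letter tokens such as "I". One possible variation, sketched below, pairs words sentence by sentence, skips pairs whose token counts differ, and lets each crowd-talk word take its most frequent English partner. The function name `build_voted_mapping` and the toy tokens are hypothetical, and it is only an assumption that this gives a better dictionary on the real data.

```python
from collections import Counter, defaultdict

def build_voted_mapping(source_sentences, target_sentences):
    """Build a word mapping by majority vote over position-aligned sentence pairs.

    Both arguments are lists of token lists. Sentence pairs with different
    token counts are skipped, since positions cannot be trusted there.
    """
    votes = defaultdict(Counter)
    for src_tokens, tgt_tokens in zip(source_sentences, target_sentences):
        if len(src_tokens) != len(tgt_tokens):
            continue  # assumption: unequal lengths mean no reliable alignment
        for src_word, tgt_word in zip(src_tokens, tgt_tokens):
            votes[src_word][tgt_word] += 1
    # Keep the most frequently seen partner for each source word.
    return {word: counts.most_common(1)[0][0] for word, counts in votes.items()}

# Toy token lists, invented for illustration only.
toy_source = [["aa", "bb"], ["aa", "cc"], ["aa", "bb", "dd"]]
toy_target = [["the", "cat"], ["the", "dog"], ["a", "cat"]]
voted = build_voted_mapping(toy_source, toy_target)
print(voted)
```

In the notebook, the per-sentence token lists from the two `simple_preprocess` cells could be passed in directly, provided they are kept under separate names instead of both being stored in `processedLines`.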

Prediction Phase ✈

In [ ]:
test_df = pd.read_csv("data/test.csv")
In [ ]:
test_df.crowdtalk[3984]
Out[ ]:
'zoetz treiahl typeauty squiend sriohl daonts schloors rhiuny'
In [ ]:
crowdtalk = test_df.crowdtalk.values
In [ ]:
processedLines = [gensim.utils.simple_preprocess(sentence) for sentence in crowdtalk]

Create the translated sentences by replacing each crowd-talk word in a test sentence with the English word it maps to in the dictionary built above.

In [ ]:
sentence = []

for tokens in processedLines:
  # Look up each crowd-talk token; unknown tokens become a single space
  sentence_part = [dict1.get(token, ' ') for token in tokens]
  # Join once per sentence, so an empty sentence yields an empty string
  sentence.append(' '.join(sentence_part))
In [ ]:
test_df['prediction'] = sentence
In [ ]:
test_df.head()
Out[ ]:
id crowdtalk prediction
0 27226 treuns schleangly throuys praests qeipp cyclui... of fingers that reduce and or mischief drew ne...
1 31034 feosch treuns schleangly gliath spluiey gheuck... the of fingers the he spoke much deeper resemb...
2 35270 scraocs knaedly squiend sriohl clield whaioght... leagues to resemblance between same thrush the...
3 23380 sqaups schlioncy yoik gnoirk cziourk schnaunk ... great and the at that unhappiness concealing m...
4 92117 schlioncy yoik psycheiancy mcountz pously mcna... and the them respite do which was them
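Crowd-talk words that are missing from the dictionary come out as blanks in the predictions, so it can help to check how much of the test set the dictionary actually covers. The helper below is a small sketch for that; the name `coverage` and the toy inputs are illustrative, not part of the original notebook.

```python
def coverage(token_lists, mapping):
    """Return the fraction of tokens across all sentences that appear in mapping."""
    total = sum(len(tokens) for tokens in token_lists)
    known = sum(1 for tokens in token_lists for tok in tokens if tok in mapping)
    return known / total if total else 0.0

# Toy check with invented tokens: two of the three tokens are known.
toy_rate = coverage([["aa", "bb"], ["cc"]], {"aa": "x", "cc": "y"})
print(f"{toy_rate:.2f}")
```

In the notebook this would be called as `coverage(processedLines, dict1)` after the test sentences have been tokenized.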

Save the predictions in the assets directory as submission.csv.

In [ ]:
!rm -rf assets
!mkdir assets
test_df.to_csv(os.path.join("assets", "submission.csv"), index=False)

Submitting our Predictions

Note: Please save the notebook before submitting it (Ctrl + S).

In [ ]:
%aicrowd notebook submit -c lingua-franca-translation -a assets --no-verify
Using notebook: getting-started-notebook-for-lingua-franca-transalation.ipynb for submission...
Scrubbing API keys from the notebook...
Collecting notebook...


                                                       ╭─────────────────────────╮                                                       
                                                       │ Successfully submitted! │                                                       
                                                       ╰─────────────────────────╯                                                       
                                                             Important links                                                             
┌──────────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│  This submission │ https://www.aicrowd.com/challenges/ai-blitz-xii/problems/lingua-franca-translation/submissions/169598              │
│                  │                                                                                                                    │
│  All submissions │ https://www.aicrowd.com/challenges/ai-blitz-xii/problems/lingua-franca-translation/submissions?my_submissions=true │
│                  │                                                                                                                    │
│      Leaderboard │ https://www.aicrowd.com/challenges/ai-blitz-xii/problems/lingua-franca-translation/leaderboards                    │
│                  │                                                                                                                    │
│ Discussion forum │ https://discourse.aicrowd.com/c/ai-blitz-xii                                                                       │
│                  │                                                                                                                    │
│   Challenge page │ https://www.aicrowd.com/challenges/ai-blitz-xii/problems/lingua-franca-translation                                 │
└──────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘