About | Deep Text

Why two text boxes?

We use two text boxes for the convenience of users. In the first box, the program suggests possible correct spellings for words marked as incorrect. To generate these suggestions, the spell checker has to run through multiple loops containing many logic rules (sets of conditions and steps) and arguments (values passed through parameters), comparing the results each time against the billions of words in the database. This slows the overall process.

For this reason, we also need to limit the number of words that can be checked at one time in the first box. As an alternative, we have introduced a second text box, where spell checking is much faster because the suggestion-generation step is skipped.

Users who want to check spelling quickly—especially for larger amounts of text and simply to spot misspellings—should use the second text box.
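
The difference between the two boxes can be illustrated with a minimal sketch. It is not the program's actual implementation: the tiny English word list, the function names, and the single-edit candidate generation are all assumptions chosen only to show why producing suggestions costs far more than flagging.

```python
import string

# Hypothetical miniature word list standing in for the real database.
WORDS = {"spell", "check", "text", "box"}
ALPHABET = string.ascii_lowercase

def edits1(word):
    """Return every string one edit (delete, swap, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    return deletes | swaps | replaces | inserts

def flag_only(text):
    """Second-box behaviour: one lookup per word, no suggestions."""
    return [w for w in text.split() if w not in WORDS]

def flag_with_suggestions(text):
    """First-box behaviour: each flagged word also triggers candidate generation."""
    return {w: sorted(edits1(w) & WORDS)
            for w in text.split() if w not in WORDS}
```

In this sketch `flag_only` does a single set lookup per word, while `flag_with_suggestions` builds a few hundred candidate strings for every flagged word before looking each one up, which is the kind of extra work the second box avoids.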

Is the spell checker marking a very common word as incorrect?

If common words such as আমি (in Assamese) are marked in red, it may be because the program has been opened in the browser for the first time and is still loading the necessary data. In such cases, you can refresh the browser or close the tab and reopen the site.

Another possible reason may be a slow internet connection. A slower connection can affect the program’s performance, since the spell checker has to check billions of words in just a fraction of a second.

A technical reason
If common words containing the special Assamese–Bengali characters য়, ড়, ঢ়, such as দয়াময়, গৌড়ীয়, পঢ়ুৱৈয়ে (in Assamese), are marked in red, the cause may be technical. Unicode allows two different ways of encoding each of these characters. One uses a single code point in which the bottom dot is inseparable from the base letter. The other uses a sequence of two code points, in which the dot (nukta) is added separately to the base letter. The two forms look identical on screen, which makes it difficult to process words efficiently. Our program uses the single (inseparable) form. If a word is typed using the alternative form (base letter + dot), the program may fail to recognize it as the same character and mark the word as an error. Unfortunately, the Unicode system has left a number of issues in the Assamese–Bengali script unresolved, including not recognizing the script appropriately.
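
As a rough illustration of the mismatch, the sketch below converts the two-code-point sequences to the single code points before comparison. The function name and the mapping table are assumptions made for this example, not the program's own code. An explicit table is used because standard NFC normalization lists these three letters as composition exclusions and so leaves them in the two-code-point form.

```python
import unicodedata

NUKTA = "\u09BC"  # BENGALI SIGN NUKTA, the separately encoded dot

# Two-code-point sequence -> single code point for the three letters.
TO_SINGLE = {
    "\u09A1" + NUKTA: "\u09DC",  # ড + dot -> ড়
    "\u09A2" + NUKTA: "\u09DD",  # ঢ + dot -> ঢ়
    "\u09AF" + NUKTA: "\u09DF",  # য + dot -> য়
}

def to_single_form(text):
    """Replace each base-letter + dot pair with its single code point."""
    for pair, single in TO_SINGLE.items():
        text = text.replace(pair, single)
    return text

typed = "\u09A6\u09AF\u09BC\u09BE"   # দয়া typed as base letter + dot
stored = "\u09A6\u09DF\u09BE"        # দয়া using the single code point

# The two strings are canonically equivalent but not equal as stored.
equivalent = unicodedata.normalize("NFD", stored) == typed
matches_after_fix = to_single_form(typed) == stored
```

Without the conversion step, `typed == stored` is false even though both strings display as the same word, which is how a correctly spelled word can end up marked in red.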

Is this the first-ever real-time spell checker for any Indian language?

We are well aware of several efforts, both in India and abroad, to develop a high-quality, real-time spell checker for Indian languages. However, none of them has yet met with considerable success. The difficulties are both technological and linguistic, and the linguistic ones are particularly significant. Indian languages are highly inflected: a single root word can produce thousands of inflected forms through the addition of various inflectional morphemes.

As expected, we use morphological and orthographical parsing algorithms to analyze and generate these forms. However, the ways of forming inflected words are so diverse, and the number of possible forms so large, that algorithms alone are often not enough.
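
The way the number of forms multiplies can be sketched with a toy example. The root, the two heavily reduced suffix slots, and the function name below are illustrative assumptions only. The sketch joins suffixes by plain concatenation and deliberately leaves out suffixes that change spelling at the join, which a real generator would have to handle with orthographic rules.

```python
from itertools import product

ROOT = "মানুহ"  # "person"

# Illustrative, heavily reduced suffix slots; "" means the slot is empty.
DEFINITE_OR_PLURAL = ["", "টো", "বোৰ"]
CASE_MARKERS = ["", "ক", "ৰ", "ত", "লৈ"]

def naive_forms(root):
    """Join the root with every combination of one suffix per slot."""
    return [root + d + c for d, c in product(DEFINITE_OR_PLURAL, CASE_MARKERS)]

forms = naive_forms(ROOT)
count = len(forms)  # 3 options x 5 options
```

Even two small slots give 15 forms for one root. Each further slot (emphatic particles, additional classifiers, verb tense and person endings) multiplies the total again, which is how a single root can reach thousands of forms.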

Possibly for this reason, even large technology companies such as Microsoft and Google have yet to offer complete spell-checking functionality for Indian languages. We do not claim that ours is the first perfect or complete spell checker for any Indian language. Rather, we believe it is probably the first real-time spell-checking program for an Indian language to achieve considerable success in both technological and linguistic terms.

Can I check the spelling of non-Unicode text, such as text in the Geetanjali font (in an old encoding)?

You cannot check non-Unicode text directly. However, you can use the Text Converter to convert it to Unicode and then paste the result in to check the spelling.

Can I contact the developer if I have a question?

You can contact us at eduserservices@gmail.com or use the YouTube link given below and leave a comment in the comment section.

About Us (हमारे बारे में)

After years of grappling with serious technical and linguistic challenges, we (Eastern Digital) have finally succeeded in developing this smart spell-checker program for Indian languages. We believe it will make an unprecedented contribution toward standardizing spelling in Hindi, Assamese, Bengali, Odia, and other Indian languages. It is possibly the first fully developed, real-time spell checker built for any Indian language. Assamese is the first language to be fully prepared in this project, even though it has the largest number of word forms among Indian languages (owing to the inflectional suffixes attached to verbs and nouns). Our biggest challenge was preparing all possible correct suffixed and inflected word forms, whose number is beyond imagination. So far, more than 4.5 billion Assamese word forms have been included, although the collection is still not complete. We hope the missing word forms can be added gradually. The underlying technology continuously monitors user input, makes decisions automatically on that basis, and keeps improving itself. We humbly ask users to bear in mind that if the spell checker does not mark a word as correct, this does not necessarily mean the spelling is wrong. Just as the possibilities for misspelling are endless, so are the possibilities for forming new, correct words. It is nearly impossible to capture this vast and constantly evolving world of words in full. Even so, we hope this spell checker will prove to be a trusted companion and guide.

Other languages: We will add Hindi soon, followed by Bengali and then Odia.

Our struggle was not limited to technical and linguistic challenges; throughout the development period we also faced extremely severe financial difficulties. If you understand the complexity of developing a comprehensive, high-quality spell checker for highly inflected Indian languages and appreciate this painstakingly built app, you may consider supporting us using the details given below. Your support will help us keep this project alive. Even a small contribution is extremely valuable to us.

About Us (আমাৰ বিষয়ে)

We provide two text boxes for the convenience of users. The first box shows some possible correct forms for the words it marks as incorrect. During this suggestion generation, the spell-checker program has to cycle through loops containing a great deal of logic (sets of rules, conditions, steps, etc.) and arguments (values passed through parameters) to prepare them. This slows the whole process down. For the same reason, we have had to limit the number of words that can be checked at one time here. We therefore offer the second text box as an alternative: it does not generate suggestions for the words marked as incorrect, so spell checking runs considerably faster. Those who want to check a large amount of text quickly, just to spot the errors, should use this second box.

Other languages: We will soon add Hindi here, followed in turn by Bengali and Odia.

Our struggle was not limited to overcoming technological and linguistic challenges; throughout this period we faced unimaginable financial hardship. If you understand how hard it is to build a complete, high-quality spell checker for Indian languages with highly complex morphology, and you appreciate this painstaking program, you may consider giving us some (unconditional) help using the details below. It will help us keep this project alive. Any small amount means a lot to us.

Support Our Work

If you find this work useful, you may support us using the following methods:

Method 1:

(UPI)

Method 2:

(PayPal) edmanager21@gmail.com (India)