Hello, my name is
Md. Shahidul Salim
Lecturer, CSE, KUET
My CV | My LinkedIn Profile | My Github Profile | My Google Scholar Profile | My Hugging Face Profile

About me

I am currently employed as a Lecturer in the Department of Computer Science and Engineering at Khulna University of Engineering & Technology (KUET), Bangladesh. I graduated from the same university with a degree in Computer Science and Engineering. My CGPA is 3.86 out of 4.0, and I placed fourth out of 121. My research focuses primarily on Machine Learning and Natural Language Processing. My publications include several journal articles and a substantial body of conference papers. I also have several journal submissions under review (Data in Brief, Engineering Applications of Artificial Intelligence), and I am actively engaged in ongoing research.

Research Area

Machine Learning
✅Convolutional Neural Networks
✅Time series analysis
✅Multimodal (text + image)
Natural Language Processing
✅LLM
✅Transformer
✅Bangla stemmer
✅Text generation
✅Text summarization
✅Conversational question answering
Artificial Intelligence
✅AI vs Human text classification

Publication and Under Review

Natural Language Processing

  1. Md. Shahidul Salim, Sk Imran Hossain, An Applied Statistics dataset for human vs AI-generated answer classification, Data in Brief, 2024, 110240, ISSN 2352-3409, https://doi.org/10.1016/j.dib.2024.110240. 🌐paper 🔗Github

  2. Nabil, A., Das, D., Salim, M. S., Arifeen, S., & Fattah, H. M. A. (2023). Bangla emergency post classification on social media using transformer-based BERT models. In 6th International Conference on Electrical Information and Communication Technology (EICT 2023). (Accepted) 🌐paper

  3. Salim, M. S., Murad, H., Das, D., & Ahmed, F. (2023). "BanglaGPT: A Generative Pretrained Transformer-Based Model for Bangla Language," 2023 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD), Dhaka, Bangladesh, 2023, pp. 56-59, doi: 10.1109/ICICT4SD59951.2023.10303383.🌐paper

  4. S. Salim, T. Islam, R. Zannat, N. Mia, M. Fuad and H. Murad, "Towards Developing a Transformer-Based Bangla Typing Error Correction Model: A Deep Learning-Based Approach," 2023 International Conference on Information and Communication Technology for Sustainable Development (ICICT4SD), Dhaka, Bangladesh, 2023, pp. 75-78, doi: 10.1109/ICICT4SD59951.2023.10303361.🌐paper

  5. T. Ahmed, S. Hossain, M. S. Salim, A. Anjum and K. M. Azharul Hasan, "Gold Dataset for the Evaluation of Bangla Stemmer," 2021 5th International Conference on Electrical Information and Communication Technology (EICT), Khulna, Bangladesh, 2021, pp. 1-6, doi:10.1109/EICT54103.2021.9733662.🌐paper

  6. M. S. Salim Shakib, T. Ahmed and K. M. Azharul Hasan, "Designing a Bangla Stemmer using rule based approach," 2019 International Conference on Bangla Speech and Language Processing (ICBSLP), Sylhet, Bangladesh, 2019, pp. 1-4, doi: 10.1109/ICBSLP47725.2019.201533.🌐paper

Machine Learning

  1. R. T. H. Promi, R. A. Nazri, M. S. Salim and S. M. T. U. Raju, "A Deep Learning Approach for Non-Invasive Hypertension Classification from PPG Signal," 2023 International Conference on Next-Generation Computing, IoT and Machine Learning (NCIM), Gazipur, Bangladesh, 2023, pp. 1-5, doi: 10.1109/NCIM59001.2023.10212940.🌐paper

  2. Hossain, L., Hossain, I., Salim, M. S., Raju, S. M. T. U., & Saha, J. (2023). A novel technique for classification of motor imagery EEG signal based on deep learning approaches. In Proceedings of the 2nd International Conference on Big Data, IoT and Machine Learning (BIM 2023). (Accepted) 🌐paper

  3. Ashiqussalehin, M., Jahan, K., Rahaman, M., & Salim, M. (2022). Human Abnormal Behavior Detection Using Convolution Neural Network. Specialusis Ugdymas, 1(43), 4076–4083.🌐paper

Under Review

  1. BCoQA: Benchmark and Resources for Bangla Context-based Conversational Question Answering (Submitted to EMNLP 2024)
     📝Developing a Bangla Context-based Conversational Question Answering (CCQA) system faces challenges such as limited domain-specific data, poor translation methods, and a lack of pretrained language models. To tackle these issues, this work constructs a robust Bangla CCQA dataset using quality-controlled machine translation and large language model-based augmentation of existing English CCQA datasets. The dataset is then divided into training, validation, and test splits. Various sequence-to-sequence models are fine-tuned and evaluated using the train and test splits, with conversation history included in the input prompts to maintain context.

  2. Bangla news article dataset (Submitted to Data in Brief)
     📝In this research, we present an updated standard Bangla dataset built from collected Bangla news articles. In total, more than 1.9 million articles were gathered from nine Bangla news websites; selection was guided by a number of categories, including sports, economy, politics, local news, tech, tourism, entertainment, education, health, the arts, and many more. The attributes available vary by newspaper and include title, content, time, tags, meta, category, etc. This dataset will enable data scientists to investigate and assess theories related to Bangla natural language processing. It is also likely to be useful for domain-specific large language models in the context of Bangladesh, and it may be used to develop deep learning and machine learning models that categorize articles by subject.

  3. LLM based QA chatbot builder: A generative AI-based chatbot for question answering (Submitted to SoftwareX)
     📝This work describes the development of a web application called the LLM QA Builder, designed to streamline the creation of LLM-based interactive chatbots for organizational information retrieval. The application integrates various development phases including data collection and preprocessing, LLM fine-tuning, testing, inference, and chat interface development. It supports fine-tuning multiple LLMs such as Zephyr, Mistral, Llama-3, Phi, Flan-T5, and user-provided models, with enhanced retrieval capabilities via retrieval-augmented generation (RAG). It also includes an automatic web crawling RAG data scraper and a human evaluation feature for model quality assessment. The system's capabilities are demonstrated through a university information chatbot, with comparative analysis of different LLMs using a benchmark crowd-sourced dataset.

  4. Agricultural Recommendation System based on Multivariate Weather Forecasting Model (Submitted to Engineering Applications of Artificial Intelligence journal) 🌐PRE-PRINT
     📝This paper proposes a context-based crop recommendation system using a weather forecast model to improve farming practices in Bangladesh. A multivariate Stacked Bi-LSTM network is used for accurate weather prediction, including rainfall, temperature, humidity, and sunshine. The system guides farmers in making informed decisions about planting, irrigation, harvesting, and more. It also alerts farmers about extreme weather conditions and provides knowledge-based crop recommendations for flood- and drought-prone areas.

  5. A Suffix Independent Algorithm for Stemming Bangla Words using Finite State Transducer (Submitted to Expert Systems With Applications)
     📝This study proposes and evaluates a suffix-independent stemming algorithm for the Bangla language using a finite state transducer (FST)-based framework. The algorithm creates a dictionary of root words implemented as an FST, achieving high speed with no memory used for vocabulary storage. A novel stemmer dataset was developed to evaluate the algorithm's performance, resulting in 96.58% detection accuracy and 96.34% stemming accuracy. The proposed scheme outperforms existing methods and demonstrates effectiveness through experiments.

  6. Suggesting Bengali words using Masked Language Model
     📝This study explores Bangla word suggestion in texts using a fine-tuned Bangla-BERT model with Masked Language Modeling (MLM). While transformer-based models like BERT have revolutionized natural language processing (NLP), the availability of task-specific datasets, especially for word suggestion in Bangla, is limited. This research addresses this gap by employing MLM to predict Bangla words. Sub-word tokenization enhances the model's ability to handle unknown words. Unlike the existing pretrained BanglaBERT model, which uses word-piece masking, we experiment with dynamic whole-word masking algorithms. Our research aims to analyze the performance of the fine-tuned BanglaBERT model under different masking strategies and compare results to the traditional BanglaBERT approach.

  7. Detecting AI-Generated Assignments in Educational Evaluation: A Transformer-Based Approach
     📝This research work presents a transformer-based model to detect whether an assignment is AI-generated or human-written. The model was trained on a dataset of 5410 assignments, with 2742 being AI-generated and 2668 being human-written. Among the explored transformer-based architectures, DistilBERT provided the highest accuracy of 92%.
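
The BCoQA entry above notes that conversation history is included in the input prompts to maintain context. Below is a minimal sketch of one way such a prompt could be assembled. The function name `build_ccqa_prompt`, the `context:` / `question:` / `answer:` field labels, and the `max_turns` cutoff are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Tuple


def build_ccqa_prompt(
    context: str,
    history: List[Tuple[str, str]],
    question: str,
    max_turns: int = 3,
) -> str:
    """Assemble a single text prompt for a sequence-to-sequence model.

    `history` holds earlier (question, answer) turns, oldest first.
    Only the most recent `max_turns` turns are kept so that the prompt
    stays short (hypothetical design choice for this sketch).
    """
    # history[-0:] would return the whole list, so handle 0 explicitly.
    recent = history[-max_turns:] if max_turns > 0 else []

    lines = [f"context: {context}"]
    for past_question, past_answer in recent:
        lines.append(f"question: {past_question}")
        lines.append(f"answer: {past_answer}")
    # The current question goes last; the model is expected to generate its answer.
    lines.append(f"question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_ccqa_prompt(
        context="KUET is a public engineering university in Khulna, Bangladesh.",
        history=[("Where is KUET located?", "Khulna, Bangladesh")],
        question="What kind of university is it?",
    )
    print(prompt)
```

Keeping earlier turns in the prompt is what lets the model resolve a follow-up such as "it" in the last question back to KUET.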

Ongoing Research

  1. Comparing Prompt Based and Standard Fine Tuning for Bangla Text Classification

My projects

  1. LLM based QA chatbot builder 🔗Github
     🎯Python, Large Language Model (LLM), LangChain, Gradio

  2. Medical LLM Chat Bot - Chat with PDF using LLM (LlamaV2), LangChain and Streamlit 🔗Github
     🎯Python, Large Language Model (LLM), LangChain, Streamlit

  3. KUET Chat Bot - Information about KUET 🔗Github
     🎯Python, NLP

  4. Efficient Backlog Routine Generator - Building with Python and Flask for Streamlined Task Management 🔗Github
     🎯Python, Flask, First-fit-decreasing (FFD) bin packing algorithm

  5. Secure File Locker - Implementing RSA Encryption Algorithm with Python and Flask for Data Protection 🔗Github
     🎯Security, Python, RSA encryption algorithm

  6. Daily Expense Management on iOS - A User-Centric Mobile Application for Efficient Financial Tracking 🔗Github
     🎯iOS, Xcode, Swift

  7. Anonymity-Preserving Post Web Application - Implementing Python and Flask for Secure and Confidential Content Sharing 🔗Github
     🎯Python, Flask, MongoDB

  8. Windmill Simulation - A Computer Graphics Project Implemented in C++ 🔗Github
     🎯C++, OpenGL

  9. Booklist - A Mobile Application for Academic Booklists at Khulna University of Engineering & Technology (KUET) across Various Departments with PDF Links to the Books 🔗Github
     🎯Android Studio, Java

  10. Statistics exam - Design and Implementation of a Statistics Exam Generation System for Students using Python and Flask with Randomized Data Generation 🔗Github
      🎯Python, Flask

  11. Smart Home Automation with IoT - Integrating NodeMCU (ESP8266) 🔗Github
      🎯Python, Flask

  12. Counterfeit note detection - Fake Bangladeshi Banknote Detection using Convolutional Neural Networks (CNN) 🔗Github
      🎯Convolutional Neural Networks, Python, Image processing

  13. Hello OS - Simple operating system project using GRUB and QEMU 🔗Github
      🎯C++, GRUB, QEMU

  14. Brain Tumor Detection using Convolutional Neural Networks (CNN) - An AI-based Approach for Accurate Diagnosis 🔗Github
      🎯Convolutional Neural Networks, Python, Image processing

  15. Fake Bangladeshi Currency Detection - A Comparative Analysis using OpenCV and MATLAB 🔗Github
      🎯Python, OpenCV, MATLAB

  16. Building a Social Media Website - Harnessing HTML, CSS, JavaScript, ASP.NET, and C# for Dynamic Online Interaction 🔗Github
      🎯ASP.NET, C#, HTML, CSS, JavaScript
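
The Efficient Backlog Routine Generator above lists the first-fit-decreasing (FFD) bin packing algorithm among its technologies. As a rough illustration of that heuristic only, and not the project's actual code, the sketch below packs item sizes into bins of a fixed capacity. The function name `first_fit_decreasing`, and the idea of reading items as exam durations and bins as fixed-length time slots, are assumptions made for this example.

```python
from typing import List


def first_fit_decreasing(sizes: List[int], capacity: int) -> List[List[int]]:
    """Pack `sizes` into bins of size `capacity` using the FFD heuristic.

    Items are sorted from largest to smallest, and each item is placed in
    the first bin that still has room; a new bin is opened only when no
    existing bin fits. FFD is a heuristic, so the result is not guaranteed
    to use the minimum possible number of bins.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")

    bins: List[List[int]] = []   # contents of each bin
    free: List[int] = []         # remaining space in each bin

    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds bin capacity {capacity}")
        for index, space in enumerate(free):
            if size <= space:
                bins[index].append(size)
                free[index] -= size
                break
        else:
            # No existing bin had room, so open a new one.
            bins.append([size])
            free.append(capacity - size)

    return bins


if __name__ == "__main__":
    # Hypothetical example: durations (in hours) packed into 10-hour slots.
    print(first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10))
    # → [[8, 2], [4, 4, 1, 1]]
```

Sorting in decreasing order first is what distinguishes FFD from plain first-fit: large items are placed early, and smaller ones fill the gaps they leave.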

My skills

My creative skills & experiences.

I have experience programming in C++ and Python. I have completed several projects, including a chatbot for university information, websites for evaluating students, a personal voice assistant, and a file locker that uses an encryption algorithm. Recently, I have been working with LLMs and LangChain, and I am working on retraining and fine-tuning LLMs using the PEFT library. My machine learning projects include fake note detection, human abnormality detection, brain tumour detection, and an intelligent chatbot. My natural language processing projects include an intelligent chatbot that uses LSTM, Bangla text summarization using a transformer model, Bangla keyboard error correction using a transformer model, a Bangla stemmer design, Bangla word-to-vector conversion, and a chatbot that uses the seq2seq model. I also have experience with hardware projects such as home automation using IoT, and with image processing through fake Bangladeshi currency detection using OpenCV. Furthermore, I have experience in database management using MySQL.

PyTorch, TensorFlow: 90%
Python: 90%
C++: 90%
Machine learning, NLP: 80%
Image processing and Security: 60%
Data structure and algorithm: 70%
JavaScript: 80%
Java, C#, Swift: 60%
SQL: 90%
MySQL, HTML, CSS: 70%

Contact me

Get in Touch

For any information, you can contact me.

Name: Md. Shahidul Salim
Present Address: CSE, KUET, Khulna, Bangladesh
Permanent Address: Bromottor, Rangunia, Chattogram, Bangladesh
Email: shahidulshakib034@gmail.com, ss@cse.kuet.ac.bd