Research Experience
	
		
	
		
		- Comparison Between MFCC and PLP in Automatic Speech Recognition (ASR) of Bangla Speech Corpus
Publications
    
      - 
        Publisher: Przeglad Elektrotechniczny
        In this study, the effectiveness of six machine learning and eight deep learning algorithms in analyzing electroencephalogram (EEG) signals for detecting epileptic seizures has been investigated. The study utilizes the 14 channels of the EMOTIV EPOC+ device, which is based on the international 10-20 system. To find the most informative and sensitive channel, each of the 14 channels has been dropped one at a time. The accuracy values were determined for all the methods using two publicly available datasets: the Guinea-Bissau epilepsy dataset and the Nigeria epilepsy dataset. Among the machine learning models, the SVM classifier performs best, with a maximum accuracy of 83.2% (Guinea-Bissau) and 77% (Nigeria) without excluding any channels. No significant performance degradation has been observed for single-channel exclusion with this classifier. Among the deep learning models, the four best-performing models in terms of accuracy are CNN-LSTM (92.5%), IC-RNN (91.8%), ChronoNet (91.1%) and C-DRNN (88.6%). After excluding one channel at a time and investigating the effect on the performance of the four DL models, it has been observed that the most significant and most sensitive channels lie within the frontal and parietal zones. This finding is useful in practice, as it indicates that the electrodes in the frontal and parietal zones should be placed precisely for accurate diagnosis. In addition, this study also explores the effectiveness of the selected classifiers in detecting seizures when any particular EEG signal channel fails.
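The one-channel-at-a-time exclusion described in this abstract can be sketched as a simple loop. This is a minimal illustration, not the study's code: `evaluate` is a hypothetical user-supplied function that trains and tests a classifier on a given channel subset and returns its accuracy, and the helper names are assumptions made for this example.

```python
# The 14 electrode labels of the EMOTIV EPOC+ headset (10-20 system positions).
EPOC_CHANNELS = ["AF3", "F7", "F3", "FC5", "T7", "P7", "O1",
                 "O2", "P8", "T8", "FC6", "F4", "F8", "AF4"]


def channel_sensitivity(evaluate, channels=EPOC_CHANNELS):
    """Return {channel: accuracy drop when that channel is excluded}.

    `evaluate` is a hypothetical callable: it takes a list of channel
    names and returns the classifier accuracy obtained with only those
    channels.
    """
    baseline = evaluate(list(channels))  # accuracy with all channels kept
    drops = {}
    for ch in channels:
        kept = [c for c in channels if c != ch]  # exclude exactly one channel
        drops[ch] = baseline - evaluate(kept)
    return drops


def rank_channels(drops):
    """Sort channels from most to least sensitive (largest drop first)."""
    return sorted(drops, key=drops.get, reverse=True)
```

A channel with a large accuracy drop is one the classifier depends on heavily, which is the sense in which the abstract calls the frontal and parietal channels "sensitive".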
        
- 
        Publisher: IEEE
        Seizure detection using electroencephalogram (EEG) signals plays a crucial role in diagnosing and treating epilepsy. However, the cost of deploying EEG-based seizure detection systems limits their widespread adoption, particularly in resource-constrained healthcare settings such as remote areas of countries like Bangladesh. This research aims at fostering cost-efficient, fully automated initial seizure diagnosis in remote areas. This paper presents a novel approach for classifier selection in EEG seizure detection systems to improve cost-effectiveness while maintaining high detection accuracy. Two publicly available datasets of EEG signals from a low-cost, consumer-grade EEG headset with 14 channels have been chosen to ensure lower cost during data acquisition along with lower complexity of classifier design. Fourteen state-of-the-art classifiers found in recent literature have been used to distinguish between epileptic and healthy individuals. First, we chose the four most accurate classifiers: ChronoNet (94.6%), CNN-LSTM (92.5%), IC-RNN (91.8%), and C-DRNN (88.6%). We then compared these four models in terms of evaluation time (ET) and trainable parameters (TP), along with their Cohen's kappa scores, and discarded the more complex CNN-LSTM model due to its extremely high ET (1.48 s) and TP (about 2.2 million). We also investigated the sensitivity of the best-performing classifiers to individual channels and found that the most sensitive channels lie in the frontal and parietal zones. After comparing robustness to the failure of any EEG signal channel, C-DRNN was found to be the most robust and the fastest for initial diagnosis with the least resources.
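For readers unfamiliar with the Cohen's kappa score mentioned in this abstract, the following is a minimal sketch of the standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name and the label encoding (1 for epileptic, 0 for healthy) are assumptions for illustration, not taken from the paper.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa between true and predicted labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of samples where prediction equals truth.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one class is present")
    return (observed - expected) / (1.0 - expected)
```

Unlike raw accuracy, kappa is 0 for a classifier that agrees with the labels only as often as chance would, which makes it a useful companion metric when the two classes are imbalanced.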
        
- 
        Publisher: MAT Journals
        Electroencephalograms (EEGs) are commonly used to diagnose brain conditions such as epilepsy. Deep learning algorithms have proven to be
          effective in analyzing EEG signals for detecting epileptic seizures. In this study, a machine learning model, Random Forest Tree (RFT); two deep
          learning models, Convolutional Neural Network (CNN) and ChronoNet; and two transformer models, Vision Transformer (ViT) and Swin Transformer (ST),
          have been proposed to distinguish between epileptic and healthy individuals. In Random Forest Tree, features are chosen randomly,
          different trees are trained on the data, and the predictions from all trees are combined to get the final prediction. ChronoNet is popular for modeling
          time-series data. ViT is a neural network architecture designed for image recognition tasks. Unlike traditional
          convolutional neural networks (CNNs), which rely on convolutional feature extraction layers, ViTs use self-attention mechanisms to learn relevant
          image features directly from raw pixel values. Swin Transformer is a ViT variant that processes the image
          hierarchically. ChronoNet has performed best among the models, with accuracies of 94% and 88.7% for the Guinea-Bissau and Nigeria datasets, respectively.
          The transformer models have poor accuracies in this study, as only the first 3 of the 14 channels of the 10-20 system have been used to
          create images from signals because of the scarcity of resources. Compared to the Swin Transformer model, the Vision Transformer architecture has
          shown better performance in classifying epileptic patients and healthy individuals. The performances of the RFT and CNN models have been satisfactory
          as well. The models have been trained and tested on publicly available data.
- 
        Publisher: MECS Press
        In this paper, the bit error rate (BER) performance of SFBC-OFDM systems for frequency-selective fading channels is observed for various antenna
          orientations and modulation schemes. The objective is to find a suitable configuration with the minimum number of receiving antennas that requires
          the minimum signal power level at the receiver to provide reliable voice and video communication. We have considered both M-ary phase shift keying (MPSK)
           and M-ary quadrature amplitude modulation (MQAM) in the performance analysis, under both perfect and imperfect channel state information (CSI).
           The BER under imperfect channel estimation is expressed as a function of the BER under perfect channel estimation.
           The findings show that, for a BTS with 4 transmitting antennas and an MS with 2 receiving antennas, BPSK performs better for both perfect and imperfect CSI.
           The maximum permissible channel estimation error increases with the use of more receiving antennas, at the expense of increased cost.
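To illustrate why more receiving antennas lower the signal power needed for a target BER, the sketch below evaluates two textbook closed forms for BPSK: the AWGN bit error rate and the flat Rayleigh fading bit error rate with maximal-ratio combining over L independent receive branches. This is a simplified baseline under stated assumptions (flat fading, perfect CSI, receive diversity only); it is not the SFBC-OFDM analysis of the paper, and the function names are chosen for this example.

```python
import math


def db_to_linear(db):
    """Convert a value in decibels to a linear power ratio."""
    return 10.0 ** (db / 10.0)


def ber_bpsk_awgn(snr):
    """BPSK bit error rate over AWGN; `snr` is Eb/N0 as a linear ratio."""
    return 0.5 * math.erfc(math.sqrt(snr))


def ber_bpsk_rayleigh_mrc(avg_snr, branches=1):
    """BPSK bit error rate over flat Rayleigh fading with MRC.

    `avg_snr` is the average SNR per branch (linear ratio) and `branches`
    is the number of independent receive antennas.
    """
    mu = math.sqrt(avg_snr / (1.0 + avg_snr))
    total = sum(
        math.comb(branches - 1 + k, k) * ((1.0 + mu) / 2.0) ** k
        for k in range(branches)
    )
    return ((1.0 - mu) / 2.0) ** branches * total
```

For example, comparing `ber_bpsk_rayleigh_mrc(db_to_linear(10), 1)` with `ber_bpsk_rayleigh_mrc(db_to_linear(10), 2)` shows the error rate falling as a second receive branch is added at the same per-branch SNR.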
- 
        Publisher: MECS Press
        In this work, a 5-state left-to-right HMM-based Bangla isolated-word speech recognizer has been developed. To train and test the recognizer, a small corpus at
          various sampling frequencies has been developed in both noisy and noiseless environments. The number of filter banks is varied during the feature extraction
          phase for both MFCC and PLP. The effects of the 2nd and 3rd differential coefficients have also been observed. Experimental results show that the MFCC-based feature
          extraction technique is better in the CLASSROOM environment, whereas the PLP-based technique performs better not only in a noiseless environment but also when AC
          or FAN noise is present. We have also noticed that a higher sampling frequency and a higher filter order do not always improve performance.
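The left-to-right topology mentioned in this abstract can be sketched as a transition matrix in which each state may only stay where it is or advance to the next state. The self-loop probability `stay_prob` and the treatment of the last state as absorbing are assumptions for illustration; the abstract does not give the initial values, and whether the five states include non-emitting entry and exit states is not stated.

```python
def left_to_right_transitions(n_states=5, stay_prob=0.6):
    """Build an initial left-to-right HMM transition matrix.

    Each state either repeats (probability `stay_prob`) or moves to the
    next state (probability 1 - stay_prob); no skips or backward jumps.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 < stay_prob < 1.0:
        raise ValueError("stay_prob must lie strictly between 0 and 1")
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        matrix[i][i] = stay_prob            # self-loop
        matrix[i][i + 1] = 1.0 - stay_prob  # advance to the next state
    matrix[-1][-1] = 1.0                    # final state is absorbing
    return matrix
```

In practice such a matrix is only a starting point; training re-estimates the non-zero entries while the zeros keep the model strictly left-to-right.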
- 
         Publisher: IEEE
         This paper has observed the effects of different coefficients for the MFCC and PLP feature extraction techniques for a Bangla corpus system. We have first observed
          the effects of 12 coefficients for every 10 ms frame, and then added the delta and acceleration coefficients to get 24- and 36-coefficient vectors per frame,
          respectively. We have then observed the effect of appending the power coefficient and its first and second derivatives, giving a 39-coefficient feature
          vector per frame. In addition, we have appended 13 third differential coefficients to make a vector of 52 coefficients per frame and observe the effect
          of third differential coefficients as well. From the experimental results, we have observed that for gender-unbiased models, delta addition has shown the maximum
          detection for both speaker-dependent and speaker-independent systems. For speaker-independent gender-biased models, however, adding acceleration, power, and third differential
          coefficients has increased the detection for both MFCC and PLP in noise-free audio samples with a sampling rate of 44.1 kHz.
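The growth of the feature vector described in this abstract (13 static values per frame including power, then 26, 39, and 52 as each derivative order is appended) can be sketched as follows. The delta computation uses the common regression formula with a window of 2 frames and edge replication; that choice, and the function names, are assumptions for illustration rather than details confirmed by the paper.

```python
def delta(frames, window=2):
    """Regression-based differential coefficients for a list of frames.

    `frames` is a list of equal-length coefficient lists, one per frame.
    Frames beyond either end are replaced by the first or last frame.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(static, orders=3, window=2):
    """Append `orders` successive differential sets to the static features."""
    feats = [list(row) for row in static]
    current = static
    for _ in range(orders):
        current = delta(current, window)  # derivative of the previous set
        feats = [f + d for f, d in zip(feats, current)]
    return feats
```

With 13 static values per frame, `stack_features(static, orders=2)` yields 39 values per frame and `orders=3` yields 52, matching the vector sizes named in the abstract.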
- 
         Publisher: IEEE
         This paper has observed that different environmental noises and sampling frequencies severely affect the performance of the MFCC- and PLP-based Bangla Isolated Word
          Recognition System. We have observed the effects of different environments on MFCC and PLP for 39 and 52 coefficients at 8 kHz, 16 kHz, 32 kHz, and 44.1 kHz sampling
          frequencies. From the experimental results, we have observed that, across the sampling frequencies and in both noiseless and noisy conditions, PLP models detect better than MFCC models,
          except in the CLASSROOM environment, where different types of noise are present simultaneously.
- 
         Publisher: ELSEVIER
         In terms of mechanical flexibility, organic SRAM offers better designs and a commercially feasible option with the ability to deliver acceptable performance. This paper
          investigates the implementation of different SRAM topologies based on organic thin-film transistors (OTFTs). In this work, a compact SPICE model is used to simulate pOTFT
          and nOTFT devices in LTspice. Time delays, power consumption, the power delay product (PDP), and static noise margin (SNM) for read and write operations are calculated,
          and a comparative analysis of OTFT-based 6T, 7T, 8T, and 9T SRAM topologies is performed. Among the topologies, the 9T OTFT SRAM cell achieves a 1.67× increase in SNM
          compared to the conventional 6T OTFT-based SRAM cell. The 9T SRAM cell has the highest figure of merit, which indicates its suitability for various applications.
- 
         Publisher: World Scientific
         This paper presents a performance analysis of indium-gallium-zinc-oxide (IGZO)- and pentacene-based top-gate-top-contact (TGTC) and bottom-gate-top-contact (BGTC) thin-film
          transistors (TFTs). Extensive simulation has been performed to assess performance in terms of threshold voltage, subthreshold slope, on-off current ratio,
          mobility, and figure of merit (FoM). Results indicate a trade-off between mobility and current ratio with respect to the permittivity of the dielectric layer, where
          tantalum oxide (Ta2O5) provides the optimum result in terms of FoM. The mobility of IGZO is significantly higher for both structures, whereas the current ratio for IGZO is
          higher than that of pentacene in the BGTC configuration. Comparing the structural configurations, the Ta2O5-IGZO-based BGTC achieves 5.92× and 41.8× better mobility and current ratio,
          respectively, than the TGTC structures. The threshold voltage of the IGZO-based TFT is observed to increase with the permittivity of the dielectric in the TGTC configuration but
          decrease in the BGTC configuration. Meanwhile, an increase in oxide and active layer thicknesses causes a decrease in the threshold voltage. Moreover, both mobility and
          current ratio improve with a decrease in oxide or active layer thickness. A maximum mobility of 32.30 cm²/Vs and a maximum current ratio of 7.54 × 10^8 are achieved for the
          Ta2O5-IGZO-based BGTC TFT with 10 μm channel thickness and 5 μm oxide thickness.