Journal Description
Big Data and Cognitive Computing
Big Data and Cognitive Computing is an international, peer-reviewed, open access journal on big data and cognitive computing, published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: JCR - Q1 (Computer Science, Theory and Methods) / CiteScore - Q1 (Management Information Systems)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 18 days after submission; acceptance to publication takes 4.5 days (median values for papers published in this journal in the first half of 2024).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 3.7 (2023)
Latest Articles
Digital Eye-Movement Outcomes (DEMOs) as Biomarkers for Neurological Conditions: A Narrative Review
Big Data Cogn. Comput. 2024, 8(12), 198; https://doi.org/10.3390/bdcc8120198 - 19 Dec 2024
Abstract
Eye-movement assessment is a key component of neurological evaluation, offering valuable insights into neural deficits and underlying mechanisms. This narrative review explores the emerging subject of digital eye-movement outcomes (DEMOs) and their potential as sensitive biomarkers for neurological impairment. Eye tracking has become a useful method for investigating visual system functioning, attentional processes, and cognitive mechanisms. Abnormalities in eye movements, such as altered saccadic patterns or impaired smooth pursuit, can act as important diagnostic indicators for various neurological conditions. The non-invasive nature, cost-effectiveness, and ease of implementation of modern eye-tracking systems make them particularly attractive in both clinical and research settings. Advanced digital eye-tracking technologies and analytical methods enable precise quantification of eye-movement parameters, complementing subjective clinical evaluations with objective data. This review examines how DEMOs could contribute to the localisation and diagnosis of neural impairments, potentially serving as useful biomarkers. By comprehensively exploring the role of eye-movement assessment, this review aims to highlight the common eye-movement deficits seen in neurological injury and disease by using the examples of mild traumatic brain injury and Parkinson's disease. This review also aims to enhance the understanding of the potential use of DEMOs in diagnosis, monitoring, and management of neurological disorders, ultimately improving patient care and deepening our understanding of complex neurological processes. Furthermore, we consider the broader implications of this technology in unravelling the complexities of visual processing, attention mechanisms, and cognitive functions. This review summarises how DEMOs could reshape our understanding of brain health and allow for more targeted and effective neurological interventions.
Open Access Article
Forecasting Human Core and Skin Temperatures: A Long-Term Series Approach
by Xinge Han, Jiansong Wu, Zhuqiang Hu, Chuan Li and Boyang Sun
Big Data Cogn. Comput. 2024, 8(12), 197; https://doi.org/10.3390/bdcc8120197 - 19 Dec 2024
Abstract
Human core and skin temperature (Tcr and Tsk) are crucial indicators of human health and are commonly utilized in diagnosing various types of diseases. This study presents a deep learning model that combines a long-term series forecasting method with transfer learning techniques, capable of making precise, personalized predictions of Tcr and Tsk in high-temperature environments with only a small corpus of actual training data. To practically validate the model, field experiments were conducted in complex environments, and a thorough analysis of the effects of three diverse training strategies on the overall performance of the model was performed. The comparative analysis revealed that the optimized training method significantly improved prediction accuracy for forecasts extending up to 10 min into the future. Specifically, the approach of pretraining the model on in-distribution samples followed by fine-tuning markedly outperformed other methods in terms of prediction accuracy, with a prediction error for Tcr within ±0.14 °C and Tsk,mean within ±0.46 °C. This study provides a viable approach for the precise, real-time prediction of Tcr and Tsk, offering substantial support for advancing early warning research of human thermal health.
Open Access Article
Arabic Opinion Classification of Customer Service Conversations Using Data Augmentation and Artificial Intelligence
by Rihab Fahd Al-Mutawa and Arwa Yousuf Al-Aama
Big Data Cogn. Comput. 2024, 8(12), 196; https://doi.org/10.3390/bdcc8120196 - 19 Dec 2024
Abstract
Customer satisfaction is not just a significant factor but a cornerstone for smart cities and their organizations that offer services to people. It enhances the organization's reputation and profitability and drastically raises the chances of returning customers. Unfortunately, customer support service through online chat is often not rated by customers, which hinders service improvement. This study employs artificial intelligence and data augmentation to predict customer satisfaction ratings from conversations by analyzing the responses of customers and service providers. For the study, the authors obtained actual conversations between customers and real agents from the call center database of Jeddah Municipality that were rated by customers on a scale of 1–5. They trained and tested five prediction models with approaches based on logistic regression, random forest, and ensemble-based deep learning, and fine-tuned two recent pre-trained models: ArabicT5 and SaudiBERT. Then, they repeated training and testing the models after applying a data augmentation technique using the generative artificial intelligence model GPT-4 to mitigate the class imbalance in the customer conversation data. The study found that the ensemble-based deep learning approach best predicts the five-, three-, and two-class classifications. Moreover, data augmentation improved accuracy using the ensemble-based deep learning model with a 1.69% increase and the logistic regression model with a 3.84% increase. This study contributes to the advancement of Arabic opinion mining, as it is the first to report the performance of determining customer satisfaction levels using Arabic conversation data. The implications of this study are significant, as the findings can be applied to improve customer service in various organizations.
Open Access Article
Mandarin Recognition Based on Self-Attention Mechanism with Deep Convolutional Neural Network (DCNN)-Gated Recurrent Unit (GRU)
by Xun Chen, Chengqi Wang, Chao Hu and Qin Wang
Big Data Cogn. Comput. 2024, 8(12), 195; https://doi.org/10.3390/bdcc8120195 - 18 Dec 2024
Abstract
Speech recognition technology is an important branch in the field of artificial intelligence, aiming to transform human speech into computer-readable text information. However, speech recognition technology still faces many challenges, such as noise interference and differences in accent and speech rate. The aim of this paper is to explore a deep learning-based speech recognition method to improve the accuracy and robustness of speech recognition. Firstly, this paper introduces the basic principles of speech recognition and existing mainstream technologies, and then focuses on the deep learning-based speech recognition method. Through comparative experiments, it is found that the self-attention mechanism performs best in speech recognition tasks. In order to further improve speech recognition performance, this paper proposes a deep learning model based on the self-attention mechanism with DCNN-GRU. The model realizes dynamic attention over the input speech by introducing the self-attention mechanism, in place of an RNN, into a neural network model together with a deep convolutional neural network, which improves the robustness and recognition accuracy of the model. The experiments use the 170 h Chinese dataset AISHELL-1. Compared with the deep convolutional neural network, the proposed model achieves a reduction of at least 6% in character error rate (CER). Compared with a bidirectional gated recurrent neural network, it achieves a reduction of 0.7% in CER. Finally, experiments on a test set analysed the factors affecting the CER. The experimental results show that the model performs well in various noise environments and accent conditions.
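The CER figures above can be grounded with a concrete definition: character error rate is the Levenshtein (edit) distance between the recognised text and the reference, divided by the reference length. A minimal sketch of that metric (illustrative, not the paper's implementation):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance over reference length."""
    r, h = list(reference), list(hypothesis)
    # prev[j] holds the edit distance between r[:i-1] and h[:j]
    prev = list(range(len(h) + 1))
    for i in range(1, len(r) + 1):
        curr = [i] + [0] * len(h)
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[len(h)] / max(len(r), 1)
```

For example, `cer("abc", "abd")` is one substitution over three reference characters, i.e. about 0.33.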
Open Access Article
Assessing the Guidelines on the Use of Generative Artificial Intelligence Tools in Universities: A Survey of the World’s Top 50 Universities
by Midrar Ullah, Salman Bin Naeem and Maged N. Kamel Boulos
Big Data Cogn. Comput. 2024, 8(12), 194; https://doi.org/10.3390/bdcc8120194 - 18 Dec 2024
Abstract
The widespread adoption of Generative Artificial Intelligence (GenAI) tools in higher education has necessitated the development of appropriate and ethical usage guidelines. This study aims to explore and assess publicly available guidelines covering the use of GenAI tools in universities, following a predefined checklist. We searched and downloaded publicly accessible guidelines on the use of GenAI tools from the websites of the top 50 universities globally, according to the 2025 QS university rankings. From the literature on GenAI use guidelines, we created a 24-item checklist, which was then reviewed by a panel of experts. This checklist was used to assess the characteristics of the retrieved university guidelines. Out of the 50 university websites explored, guidelines were publicly accessible on the sites of 41 institutions. All these guidelines allowed for the use of GenAI tools in academic settings provided that specific instructions detailed in the guidelines were followed. These instructions encompassed securing instructor consent before utilization, identifying appropriate and inappropriate instances for deployment, employing suitable strategies in classroom settings and assessment, appropriately integrating results, acknowledging and crediting GenAI tools, and adhering to data privacy and security measures. However, our study found that only a small number of the retrieved guidelines offered instructions on the AI algorithm (understanding how it works), the documentation of prompts and outputs, AI detection tools, and mechanisms for reporting misconduct. Higher education institutions should develop comprehensive guidelines and policies for the responsible use of GenAI tools. These guidelines must be frequently updated to stay in line with the fast-paced evolution of AI technologies and their applications within the academic sphere.
Open Access Article
EIF-SlideWindow: Enhancing Simultaneous Localization and Mapping Efficiency and Accuracy with a Fixed-Size Dynamic Information Matrix
by Javier Lamar Léon, Pedro Salgueiro, Teresa Gonçalves and Luis Rato
Big Data Cogn. Comput. 2024, 8(12), 193; https://doi.org/10.3390/bdcc8120193 - 17 Dec 2024
Abstract
This paper introduces EIF-SlideWindow, a novel enhancement of the Extended Information Filter (EIF) algorithm for Simultaneous Localization and Mapping (SLAM). Traditional EIF-SLAM, while effective in many scenarios, struggles with inaccuracies in highly non-linear systems or environments characterized by significant non-Gaussian noise. Moreover, the computational complexity of EIF/EKF-SLAM scales with the size of the environment, often resulting in performance bottlenecks. Our proposed EIF-SlideWindow approach addresses these limitations by maintaining a fixed-size information matrix and vector, ensuring constant-time processing per robot step, regardless of trajectory length. This is achieved through a sliding window mechanism centered on the robot's pose, where older landmarks are systematically replaced by newer ones. We assess the effectiveness of EIF-SlideWindow using simulated data and demonstrate that it outperforms standard EIF/EKF-SLAM in both accuracy and efficiency. Additionally, our implementation leverages PyTorch for matrix operations, enabling efficient execution on both CPU and GPU. The code for this approach is also made available for further exploration and development.
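Keeping the information matrix at a fixed size implies removing old landmarks from the state; in information form, removing variables is a marginalisation computed with a Schur complement. The sketch below shows that generic operation in NumPy; the function name and interface are illustrative assumptions, not taken from the paper's released code:

```python
import numpy as np

def marginalize(Lambda: np.ndarray, eta: np.ndarray, idx: list):
    """Remove the state entries listed in `idx` from an information
    matrix/vector pair (Lambda, eta) via the Schur complement, so the
    remaining variables keep a consistent marginal distribution."""
    keep = [i for i in range(Lambda.shape[0]) if i not in idx]
    Laa = Lambda[np.ix_(keep, keep)]   # block of kept variables
    Lab = Lambda[np.ix_(keep, idx)]    # cross block kept/removed
    Lbb = Lambda[np.ix_(idx, idx)]     # block of removed variables
    Lbb_inv = np.linalg.inv(Lbb)
    new_Lambda = Laa - Lab @ Lbb_inv @ Lab.T
    new_eta = eta[keep] - Lab @ Lbb_inv @ eta[idx]
    return new_Lambda, new_eta
```

In a sliding-window filter, this would be applied to the oldest landmark's entries each time a new landmark must enter the fixed-size window.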
Open Access Article
Integrating Statistical Methods and Machine Learning Techniques to Analyze and Classify COVID-19 Symptom Severity
by Yaqeen Raddad, Ahmad Hasasneh, Obada Abdallah, Camil Rishmawi and Nouar Qutob
Big Data Cogn. Comput. 2024, 8(12), 192; https://doi.org/10.3390/bdcc8120192 - 16 Dec 2024
Abstract
Background/Objectives: The COVID-19 pandemic, caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), led to significant global health challenges, including the urgent need for accurate symptom severity prediction aimed at optimizing treatment. While machine learning (ML) and deep learning (DL) models have shown promise in predicting COVID-19 severity using imaging and clinical data, there is limited research utilizing comprehensive tabular symptom datasets. This study aims to address this gap by leveraging a detailed symptom dataset to develop robust models for categorizing COVID-19 symptom severity, thereby enhancing clinical decision making. Methods: A unique tabular dataset was created using questionnaire responses from 5654 individuals, including demographic information, comorbidities, travel history, and medical data. Both unsupervised and supervised ML techniques were employed, including k-means clustering to categorize symptom severity into mild, moderate, and severe clusters. In addition, classification models, namely, Support Vector Machine (SVM), Adaptive Boosting (AdaBoost), eXtreme Gradient Boosting (XGBoost), random forest, and a deep neural network (DNN) were used to predict symptom severity levels. Feature importance was analyzed using the random forest model for its robustness with high-dimensional data and ability to capture complex non-linear relationships, and statistical significance was evaluated through ANOVA and Chi-square tests. Results: Our study showed that fatigue, joint pain, and headache were the most important features in predicting severity. SVM, AdaBoost, and random forest achieved an accuracy of 94%, while XGBoost achieved an accuracy of 96%. DNN showed robust performance in handling complex patterns with 98% accuracy. In terms of precision and recall metrics, both the XGBoost and DNN models demonstrated robust performance, particularly for the moderate class. XGBoost recorded 98% precision and 97% recall, while DNN achieved 99% precision and recall. The clustering approach improved classification accuracy by reducing noise and dimensionality. Statistical tests confirmed the significance of additional features like Body Mass Index (BMI), age, and dominant variant type. Conclusions: Integrating symptom data with advanced ML models offers a promising approach for accurate COVID-19 severity classification. This method provides a reliable tool for healthcare professionals to optimize patient care and resource management, particularly in managing COVID-19 and potential future pandemics. Future work should focus on incorporating imaging and clinical data to further enhance model accuracy and clinical applicability.
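The k-means step that groups symptom severity into mild, moderate, and severe clusters can be illustrated with a minimal Lloyd's-algorithm sketch (a simplified stand-in for the study's pipeline; real use would rely on a library implementation such as scikit-learn):

```python
import numpy as np

def kmeans(X: np.ndarray, k: int, iters: int = 100):
    """Minimal Lloyd's k-means with a deterministic spread initialisation.
    Returns (centroids, labels) for 2-D data matrix X (samples x features)."""
    order = np.argsort(np.linalg.norm(X, axis=1))
    centroids = X[order[np.linspace(0, len(X) - 1, k).astype(int)]].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assign each sample to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute centroids; keep old centroid if a cluster empties
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels
```

With k = 3, the clusters can then be ordered by centroid magnitude and labelled mild, moderate, and severe.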
Open Access Article
Re-Evaluating Deep Learning Attacks and Defenses in Cybersecurity Systems
by Meaad Ahmed, Qutaiba Alasad, Jiann-Shiun Yuan and Mohammed Alawad
Big Data Cogn. Comput. 2024, 8(12), 191; https://doi.org/10.3390/bdcc8120191 - 16 Dec 2024
Abstract
Cybersecurity attacks pose a significant threat to the security of network systems through intrusions and illegal communications. Measuring the vulnerability of cybersecurity is crucial for refining the overall system security to further mitigate potential security risks. Machine learning (ML)-based intrusion detection systems (IDSs) are mainly designed to detect malicious network traffic. Unfortunately, ML models have recently been demonstrated to be vulnerable to adversarial perturbation, which enables potential attackers to crash the system during normal operation. Among different attacks, generative adversarial networks (GANs) are known as one of the most powerful threats to cybersecurity systems. To address these concerns, it is important to explore new defense methods and understand the nature of different types of attacks. In this paper, we investigate four serious attacks, GAN, Zeroth-Order Optimization (ZOO), kernel density estimation (KDE), and DeepFool attacks, on cybersecurity. Deep analysis was conducted on these attacks using three different cybersecurity datasets, ADFA-LD, CSE-CICIDS2018, and CSE-CICIDS2019. Our results have shown that KDE and DeepFool attacks are stronger than GANs in terms of attack success rate and impact on system performance. To demonstrate the effectiveness of our approach, we develop a defensive model using adversarial training where the DeepFool method is used to generate adversarial examples. The model is evaluated against GAN, ZOO, KDE, and DeepFool attacks to assess the level of system protection against adversarial perturbations. The experiment was conducted by leveraging a deep learning model as a classifier with the three aforementioned datasets. The results indicate that the proposed defensive model improves the system's resilience and mitigates the presented serious attacks.
Open Access Article
Comparative Study of Filtering Methods for Scientific Research Article Recommendations
by Driss El Alaoui, Jamal Riffi, Abdelouahed Sabri, Badraddine Aghoutane, Ali Yahyaouy and Hamid Tairi
Big Data Cogn. Comput. 2024, 8(12), 190; https://doi.org/10.3390/bdcc8120190 - 16 Dec 2024
Abstract
Given the daily influx of scientific publications, researchers often face challenges in identifying relevant content amid the vast volume of available information, typically resorting to conventional methods like keyword searches or manual browsing. Utilizing a dataset comprising 1895 users and 3122 articles from the CI&T Deskdrop collection, as well as 7947 users and 25,975 articles from CiteULike-t, we examine the effectiveness of collaborative filtering, content-based, and hybrid recommendation approaches in scientific literature recommendations. These methods automatically generate article suggestions by analyzing user preferences and historical behavior. Our findings, evaluated based on accuracy (Precision@K), ranking quality (NDCG@K), and novelty, reveal that the hybrid approach significantly outperforms other methods, tackling challenges such as cold starts and sparsity problems. This research offers theoretical insights into recommendation model effectiveness and practical implications for developing tools that enhance content discovery and researcher productivity.
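The evaluation metrics mentioned, Precision@K and NDCG@K, are standard and easy to state precisely. A small sketch with binary relevance (illustrative, not the authors' evaluation code):

```python
import math

def precision_at_k(recommended: list, relevant: set, k: int) -> float:
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def ndcg_at_k(recommended: list, relevant: set, k: int) -> float:
    """Normalised discounted cumulative gain with binary relevance:
    hits at rank i (0-based) earn 1/log2(i + 2), normalised by the
    best achievable DCG for this query."""
    dcg = sum(1 / math.log2(i + 2)
              for i, item in enumerate(recommended[:k]) if item in relevant)
    ideal = sum(1 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0
```

For instance, with recommendations ["a", "b", "c", "d"] and relevant items {"a", "c"}, Precision@2 is 0.5, and NDCG@4 is below 1 because the second hit sits at rank 3 rather than rank 2.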
Open Access Article
An Intelligent Self-Validated Sensor System Using Neural Network Technologies and Fuzzy Logic Under Operating Implementation Conditions
by Serhii Vladov, Victoria Vysotska, Valerii Sokurenko, Oleksandr Muzychuk and Lyubomyr Chyrun
Big Data Cogn. Comput. 2024, 8(12), 189; https://doi.org/10.3390/bdcc8120189 - 13 Dec 2024
Abstract
This article presents an intelligent self-validated sensor system developed for dynamic objects and based on the intelligent sensor concept, which ensures autonomous data collection and real-time analysis while adapting to changing conditions and compensating for errors. The research's scientific contribution is an intelligent self-validated sensor for dynamic objects that integrates adaptive correction algorithms, fuzzy logic, and neural networks to improve the sensors' accuracy and reliability under changing operating conditions. The proposed intelligent self-validated sensor system provides real-time error compensation, long-term stability, and effective fault diagnostics. Analytical equations are described, considering corrections related to influencing factors, temporal drift, and calibration characteristics, significantly enhancing measurement accuracy and reliability. The fuzzy logic application allows for refining the scaling coefficient that adjusts the relationship between the measured parameter and influencing factors, utilizing fuzzy inference algorithms. Additionally, monitoring and diagnostics implementation for sensor states through long short-term memory (LSTM) networks enables effective fault detection. Computational experiments on the TV3-117 engine demonstrated high data-restoring accuracy during forced interruptions, reaching 99.5%. A comparative analysis with alternative approaches confirmed the advantages of using LSTM neural networks in improving measurement quality.
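The fuzzy refinement of a scaling coefficient can be pictured with a toy Sugeno-style inference: triangular memberships over an influencing factor (here, a hypothetical temperature deviation) weight candidate correction values. All membership ranges and output values below are invented for illustration and are not taken from the paper:

```python
def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def scaling_coefficient(temp_dev: float) -> float:
    """Sugeno-style fuzzy inference: map a temperature deviation (in °C)
    to a multiplicative correction coefficient for the sensor reading."""
    # rule base: (membership of the deviation, coefficient consequent)
    rules = [
        (tri(temp_dev, -20.0, -10.0, 0.0), 0.95),  # under-temperature: damp
        (tri(temp_dev, -10.0, 0.0, 10.0), 1.00),   # nominal: no correction
        (tri(temp_dev, 0.0, 10.0, 20.0), 1.05),    # over-temperature: boost
    ]
    # weighted-average defuzzification
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules)
    return num / den if den > 0 else 1.0
```

A deviation of 0 °C fires only the nominal rule (coefficient 1.0), while intermediate deviations blend adjacent rules smoothly.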
Open Access Article
Integrating Generative AI in Hackathons: Opportunities, Challenges, and Educational Implications
by Ramteja Sajja, Carlos Erazo Ramirez, Zhouyayan Li, Bekir Z. Demiray, Yusuf Sermet and Ibrahim Demir
Big Data Cogn. Comput. 2024, 8(12), 188; https://doi.org/10.3390/bdcc8120188 - 13 Dec 2024
Abstract
Hackathons have become essential in the software industry, fostering innovation and skill development for both organizations and students. These events facilitate rapid prototyping for companies while providing students with hands-on learning opportunities that bridge theory and practice. Over time, hackathons have evolved from competitive arenas into dynamic educational platforms, promoting collaboration between academia and industry. The integration of artificial intelligence (AI) and machine learning is transforming hackathons, enhancing learning experiences, and introducing ethical considerations. This study examines the impact of generative AI tools on technological decision-making during the 2023 University of Iowa Hackathon. It analyzes how AI influences project efficiency, learning outcomes, and collaboration, while addressing the ethical challenges posed by its use. The findings offer actionable insights and strategies for effectively integrating AI into future hackathons, balancing innovation, ethics, and educational value.
Open Access Systematic Review
Predictive Models for Educational Purposes: A Systematic Review
by Ahlam Almalawi, Ben Soh, Alice Li and Halima Samra
Big Data Cogn. Comput. 2024, 8(12), 187; https://doi.org/10.3390/bdcc8120187 - 13 Dec 2024
Abstract
This systematic literature review evaluates predictive models in education, focusing on their role in forecasting student performance, identifying at-risk students, and personalising learning experiences. The review compares the effectiveness of machine learning (ML) algorithms such as Support Vector Machines (SVMs), Artificial Neural Networks (ANNs), and Decision Trees with traditional statistical models, assessing their ability to manage complex educational data and improve decision-making. The search, conducted across databases including ScienceDirect, IEEE Xplore, ACM Digital Library, and Google Scholar, yielded 400 records. After screening and removing duplicates, 124 studies were included in the final review. The findings show that ML algorithms consistently outperform traditional models due to their capacity to handle large, non-linear datasets and continuously enhance predictive accuracy as new patterns emerge. These models effectively incorporate socio-economic, demographic, and academic data, making them valuable tools for improving student retention and performance. However, the review also identifies key challenges, including the risk of perpetuating biases present in historical data, issues of transparency, and the complexity of interpreting AI-driven decisions. In addition, reliance on varying data processing methods across studies reduces the generalisability of current models. Future research should focus on developing more transparent, interpretable, and equitable models while standardising data collection and incorporating non-traditional variables, such as cognitive and motivational factors. Ensuring transparency and ethical standards in handling student data is essential for fostering trust in AI-driven models.
Open Access Article
An Analysis of Vaccine-Related Sentiments on Twitter (X) from Development to Deployment of COVID-19 Vaccines
by Rohitash Chandra, Jayesh Sonawane and Jahnavi Lande
Big Data Cogn. Comput. 2024, 8(12), 186; https://doi.org/10.3390/bdcc8120186 - 13 Dec 2024
Abstract
Anti-vaccine sentiments have been well-known and reported throughout the history of viral outbreaks and vaccination programmes. The COVID-19 pandemic caused fear and uncertainty about vaccines, which has been well expressed on social media platforms such as Twitter (X). We analyse sentiments from the beginning of the COVID-19 pandemic and study the public behaviour on X during the planning, development, and deployment of vaccines expressed in tweets worldwide using a sentiment analysis framework via deep learning models. We provide visualisation and analysis of anti-vaccine sentiments throughout the COVID-19 pandemic. We examine how the nature of the sentiments expressed relates to the number of tweets and monthly COVID-19 infections. Our results show a link between the number of tweets, the number of cases, and the change in sentiment polarity scores during major waves of COVID-19. We also find that the first half of the pandemic had drastic changes in the sentiment polarity scores that later stabilised, implying that the vaccine rollout impacted the nature of discussions on social media.
(This article belongs to the Special Issue Application of Semantic Technologies in Intelligent Environment)
Open Access Article
From Fact Drafts to Operational Systems: Semantic Search in Legal Decisions Using Fact Drafts
by Gergely Márk Csányi, Dorina Lakatos, István Üveges, Andrea Megyeri, János Pál Vadász, Dániel Nagy and Renátó Vági
Big Data Cogn. Comput. 2024, 8(12), 185; https://doi.org/10.3390/bdcc8120185 - 10 Dec 2024
Abstract
This research paper presents findings from an investigation into the semantic similarity search task within the legal domain, using a corpus of 1172 Hungarian court decisions. The study establishes the groundwork for an operational semantic similarity search system designed to identify cases with comparable facts using preliminary legal fact drafts. Evaluating such systems often poses significant challenges, given the need for thorough document checks, which can be costly and limit evaluation reusability. To address this, the study employs manually created fact drafts for legal cases, enabling reliable ranking of original cases within retrieved documents and quantitative comparison of various vectorization methods. The study compares twelve different text embedding solutions (the most recent became available just a few weeks before the manuscript was written), identifying Cohere's embed-multilingual-v3.0, Beijing Academy of Artificial Intelligence's bge-m3, Jina AI's jina-embeddings-v3, OpenAI's text-embedding-3-large, and Microsoft's multilingual-e5-large models as top performers. To overcome the transformer-based models' context window limitation, we investigated chunking, striding, and last chunk scaling techniques, with last chunk scaling significantly improving embedding quality. The results suggest that the effectiveness of striding varies based on token count. Notably, employing striding with 16 tokens yielded optimal results, representing 3.125% of the context window size for the best-performing models. The results also suggest that, among the models with an 8192-token context window, the bge-m3 model is superior to the jina-embeddings-v3 and text-embedding-3-large models in capturing the relevant parts of a document when the text contains a significant amount of noise. The validity of the approach was evaluated and confirmed by legal experts. These insights led to an operational semantic search system for a prominent legal content provider.
Full article
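The chunking, striding, and last-chunk-scaling techniques the abstract mentions can be illustrated generically. The sketch below is not the paper's implementation: it assumes "striding" means overlapping consecutive chunks by a fixed number of tokens, and "last chunk scaling" means down-weighting the final, shorter chunk by its relative length when averaging chunk embeddings; `embed_chunk` is a hypothetical stand-in for any embedding model.

```python
# Hedged sketch: overlap-chunk a token sequence and pool chunk embeddings,
# scaling the final short chunk's weight by its length relative to the window.

def chunk_with_stride(tokens, window, stride):
    """Split tokens into windows of size `window` that overlap by `stride` tokens."""
    step = window - stride
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break
    return chunks

def pooled_embedding(tokens, window, stride, embed_chunk):
    """Weighted mean of per-chunk embeddings, with last chunk scaling."""
    chunks = chunk_with_stride(tokens, window, stride)
    vecs, weights = [], []
    for i, chunk in enumerate(chunks):
        # Last chunk scaling: the trailing partial chunk counts proportionally less.
        w = len(chunk) / window if i == len(chunks) - 1 else 1.0
        vecs.append(embed_chunk(chunk))
        weights.append(w)
    total = sum(weights)
    dim = len(vecs[0])
    return [sum(w * v[d] for w, v in zip(weights, vecs)) / total
            for d in range(dim)]
```

With a 32-token window and a 16-token stride, a 100-token document yields six overlapping chunks, the last only 20 tokens long and therefore weighted 20/32 in the pooled vector.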
Figure 1
Open Access Article
The Use of Eye-Tracking to Explore the Relationship Between Consumers’ Gaze Behaviour and Their Choice Process
by
Maria-Jesus Agost and Vicente Bayarri-Porcar
Big Data Cogn. Comput. 2024, 8(12), 184; https://doi.org/10.3390/bdcc8120184 - 9 Dec 2024
Abstract
Eye-tracking technology can assist researchers in understanding motivational decision-making and choice processes by analysing consumers’ gaze behaviour. Previous studies have shown that attention is related to choice: the preferred stimulus is generally the most observed and the last visited before a decision is made. In this work, the relationship between gaze behaviour and decision-making was explored using eye-tracking technology. Images of six wardrobes incorporating different sustainable design strategies were presented to 57 subjects, who were tasked with selecting the wardrobe they intended to keep the longest. Looking time was longer for the version that was ultimately chosen. Detailed analyses of gaze plots and heat maps derived from the eye-tracking records were employed to identify different patterns of gaze behaviour during the selection process. These patterns included alternating attention between a few versions or comparing them against a reference, allowing the identification of stimuli that initially piqued interest but were ultimately not chosen, as well as potential doubts in the decision-making process. These findings suggest that doubts arising before a selection is made warrant further investigation. By identifying stimuli that attract attention but are not chosen, this study provides valuable insights into consumer behaviour and decision-making processes.
Full article
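The two gaze metrics this abstract relies on, total dwell time per stimulus and the identity of the last stimulus fixated before the choice, are straightforward to compute from a fixation log. A minimal sketch (the log format and stimulus names are illustrative, not taken from the study):

```python
from collections import defaultdict

# Each fixation: (stimulus_id, duration_ms), in chronological order.
fixations = [
    ("wardrobe_A", 420), ("wardrobe_B", 310), ("wardrobe_A", 650),
    ("wardrobe_C", 200), ("wardrobe_A", 380),
]

def dwell_times(fixations):
    """Total fixation duration per stimulus, a common proxy for visual attention."""
    totals = defaultdict(int)
    for stimulus, duration in fixations:
        totals[stimulus] += duration
    return dict(totals)

def last_visited(fixations):
    """Stimulus fixated immediately before the decision."""
    return fixations[-1][0]

dwell = dwell_times(fixations)
most_observed = max(dwell, key=dwell.get)
```

In this toy log the most-observed and last-visited stimulus coincide (wardrobe_A), which is the pattern the eye-tracking literature associates with the eventually chosen option.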
Figure 1
Open Access Article
eFC-Evolving Fuzzy Classifier with Incremental Clustering Algorithm Based on Samples Mean Value
by
Emmanuel Tavares, Gray Farias Moita and Alisson Marques Silva
Big Data Cogn. Comput. 2024, 8(12), 183; https://doi.org/10.3390/bdcc8120183 - 6 Dec 2024
Abstract
This paper introduces a new multiclass classifier called the evolving Fuzzy Classifier (eFC). Starting its knowledge base from scratch, the eFC structure evolves based on a clustering algorithm that can add, merge, delete, or update clusters (i.e., rules) while simultaneously providing class predictions. The procedure for adding clusters uses the idea of procrastination to prevent outliers from degrading the quality of learning. Two pruning mechanisms are used to maintain a concise and compact structure: in the first, redundant clusters are merged based on a similarity measure; in the second, obsolete and unrepresentative clusters are excluded based on an inactivity strategy. Cluster centers are adjusted based on the mean value of the attributes. The eFC model was evaluated and compared with state-of-the-art evolving fuzzy systems on 8 randomly selected data streams from the UCI and Kaggle repositories. The experimental results indicate that the eFC outperforms, or is at least comparable to, the alternative state-of-the-art models; specifically, the eFC achieved an average accuracy 7% to 37% higher than that of the competing classifiers. The results and comparisons demonstrate that the eFC is a promising alternative for classification tasks in non-stationary environments, offering good accuracy, a compact structure, low computational cost, and efficient processing time.
Full article
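The cluster lifecycle the abstract describes (add, update toward the running mean, merge similar clusters, prune inactive ones) can be sketched generically. This is an illustrative toy, not the authors' eFC implementation, and all thresholds are arbitrary:

```python
import math

class EvolvingClusterer:
    """Toy evolving clustering step: add / mean-update / merge / prune."""

    def __init__(self, add_thresh=1.0, merge_thresh=0.3, max_idle=50):
        self.clusters = []  # each cluster: {"center", "n", "idle"}
        self.add_thresh = add_thresh
        self.merge_thresh = merge_thresh
        self.max_idle = max_idle

    def update(self, x):
        # 1. Find the nearest existing cluster center.
        best, best_d = None, float("inf")
        for c in self.clusters:
            d = math.dist(c["center"], x)
            if d < best_d:
                best, best_d = c, d
        # 2. Distant samples spawn a new cluster; otherwise the winner's
        #    center is nudged with an incremental mean update.
        if best is None or best_d > self.add_thresh:
            best = {"center": list(x), "n": 1, "idle": 0}
            self.clusters.append(best)
        else:
            best["n"] += 1
            best["center"] = [m + (xi - m) / best["n"]
                              for m, xi in zip(best["center"], x)]
            best["idle"] = 0
        # 3. Age the losing clusters; prune those inactive for too long.
        for c in self.clusters:
            if c is not best:
                c["idle"] += 1
        self.clusters = [c for c in self.clusters if c["idle"] <= self.max_idle]
        # 4. Merge redundant (near-duplicate) clusters, pooling their counts.
        merged = []
        for c in self.clusters:
            for m in merged:
                if math.dist(m["center"], c["center"]) < self.merge_thresh:
                    tot = m["n"] + c["n"]
                    m["center"] = [(m["n"] * a + c["n"] * b) / tot
                                   for a, b in zip(m["center"], c["center"])]
                    m["n"] = tot
                    break
            else:
                merged.append(c)
        self.clusters = merged
```

Feeding it two distant samples creates two clusters; a third sample near the first adjusts that cluster's center to the mean of its two members.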
Figure 1
Open Access Article
A Centrality-Weighted Bidirectional Encoder Representation from Transformers Model for Enhanced Sequence Labeling in Key Phrase Extraction from Scientific Texts
by
Tsitsi Zengeya, Jean Vincent Fonou Dombeu and Mandlenkosi Gwetu
Big Data Cogn. Comput. 2024, 8(12), 182; https://doi.org/10.3390/bdcc8120182 - 4 Dec 2024
Abstract
Deep learning approaches utilizing Bidirectional Encoder Representation from Transformers (BERT) and advanced fine-tuning techniques have achieved state-of-the-art accuracies in the domain of term extraction from texts. However, BERT has a limitation in that it primarily captures the semantic context of the surrounding text without considering how relevant or central a token is to the overall document content. There has also been research on applying sequence labeling to contextualized embeddings; however, existing methods often rely solely on local context when extracting key phrases from texts. To address these limitations, this study proposes a centrality-weighted BERT model for key phrase extraction from text using sequence labelling (CenBERT-SEQ). The proposed CenBERT-SEQ model utilizes BERT to represent terms with various contextual embedding architectures and introduces a centrality-weighting layer that integrates document-level context into BERT. This layer leverages document embeddings to adjust the importance of each term based on its relevance to the entire document. Finally, a linear classifier layer is employed to model the dependencies between the outputs, thereby enhancing the accuracy of the CenBERT-SEQ model. The proposed CenBERT-SEQ model was evaluated against the standard BERT base-uncased model using three Computer Science article datasets, namely, SemEval-2010, WWW, and KDD. The experimental results show that, although the CenBERT-SEQ and BERT-base models achieved comparable accuracy, the proposed CenBERT-SEQ model achieved higher precision, recall, and F1-score than the BERT-base model. Furthermore, a comparison with related studies revealed that the proposed CenBERT-SEQ model achieved higher accuracy, precision, recall, and F1-score (95%, 97%, 91%, and 94%, respectively), demonstrating its superior capability in keyphrase extraction from scientific documents.
Full article
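The centrality-weighting idea, scaling each token representation by how relevant the token is to the whole document, can be illustrated independently of BERT. A hypothetical sketch that uses cosine similarity between each token vector and the mean document vector as the centrality weight (the actual model learns this inside a dedicated layer; this heuristic is only an analogy):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def centrality_weighted(token_vecs):
    """Scale each token vector by its similarity to the document embedding
    (here simply the mean of all token vectors)."""
    dim = len(token_vecs[0])
    doc = [sum(v[d] for v in token_vecs) / len(token_vecs) for d in range(dim)]
    weights = [max(cosine(v, doc), 0.0) for v in token_vecs]  # clip negatives
    scaled = [[w * x for x in v] for w, v in zip(weights, token_vecs)]
    return scaled, weights
```

Tokens whose vectors point in the same direction as the document embedding keep most of their magnitude, while off-topic tokens are attenuated, which is the document-level signal the centrality layer injects.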
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining)
Figure 1
Open Access Article
Suspension Parameter Estimation Method for Heavy-Duty Freight Trains Based on Deep Learning
by
Changfan Zhang, Yuxuan Wang and Jing He
Big Data Cogn. Comput. 2024, 8(12), 181; https://doi.org/10.3390/bdcc8120181 - 4 Dec 2024
Abstract
The suspension parameters of heavy-duty freight trains can deviate from their initial design values due to material aging and performance degradation. Traditional multibody dynamics simulation models are usually designed for fixed working conditions, making it difficult for them to adequately analyze the safety status of the vehicle–line system in actual operation. To address this issue, this research proposes a suspension parameter estimation technique based on a CNN-GRU network. Firstly, a prototype C80 train was utilized to build a multibody dynamics simulation model. Secondly, six suspension parameters that are key to the wheel–rail force were selected using the Sobol global sensitivity analysis method. Then, a CNN-GRU proxy model was constructed, with the actually measured wheel–rail forces as a reference; by combining this proxy model with NSGA-II (Non-dominated Sorting Genetic Algorithm II), the key suspension parameters were estimated. Finally, the estimated parameter values were applied to the vehicle–line coupled multibody dynamics model and validated. The results show that, with the corrected dynamics model, the relative errors of the simulated wheel–rail force are reduced from 9.28%, 6.24%, and 18.11% to 7%, 4.52%, and 10.44% for straight, curved, and long steep uphill conditions, respectively. The precision of the wheel–rail force simulation is thus improved, indicating that the proposed method is effective in estimating the suspension parameters of heavy-duty freight trains.
Full article
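The overall loop (a fast surrogate predicting wheel–rail force from candidate suspension parameters, plus a search algorithm minimising the gap to measured forces) can be sketched in miniature. Everything here is a stand-in: a linear toy function replaces the trained CNN-GRU proxy, random search replaces NSGA-II, and the parameter names and ranges are invented for illustration.

```python
import random

TRUE_PARAMS = {"spring_stiffness": 2.1, "damping": 0.7}

def surrogate(params):
    """Toy proxy model: suspension parameters -> a wheel-rail force statistic."""
    return 10.0 * params["spring_stiffness"] + 4.0 * params["damping"]

# The "field measurement" the search tries to reproduce (toy value).
MEASURED_FORCE = surrogate(TRUE_PARAMS)

def estimate(n_trials=20000, seed=0):
    """Random search over parameter ranges, minimising |predicted - measured|.
    In the paper this role is played by NSGA-II over multiple objectives."""
    rng = random.Random(seed)
    best, best_err = None, float("inf")
    for _ in range(n_trials):
        cand = {"spring_stiffness": rng.uniform(1.0, 3.0),
                "damping": rng.uniform(0.1, 1.5)}
        err = abs(surrogate(cand) - MEASURED_FORCE)
        if err < best_err:
            best, best_err = cand, err
    return best, best_err
```

Because the surrogate is cheap to evaluate, thousands of candidate parameter sets can be screened per second, which is the practical reason for replacing the full multibody simulation with a learned proxy during the search.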
(This article belongs to the Special Issue Perception and Detection of Intelligent Vision)
Figure 1
Open Access Article
Patient Satisfaction with the Mawiidi Hospital Appointment Scheduling Application: Insights from the Information Systems Success Model and Technology Acceptance Model in a Moroccan Healthcare Setting
by
Abdelaziz Ouajdouni, Khalid Chafik, Soukaina Allioui and Mourad Jbene
Big Data Cogn. Comput. 2024, 8(12), 180; https://doi.org/10.3390/bdcc8120180 - 3 Dec 2024
Abstract
This article aims to identify the determinants of patient satisfaction with the Mawiidi public portal in Moroccan public hospitals and to assess the effectiveness of its outpatient online booking system, using a model that integrates the Technology Acceptance Model (TAM) with the Information Systems Success Model (ISSM) within a quantitative research methodology. The analysis was based on 348 self-administered questionnaires covering eight key constructs, including information quality, patient satisfaction, perceived ease of use, and privacy protection. PLS-SEM results supported six of the eleven hypotheses tested, indicating that information quality positively influences perceived ease of use, which in turn enhances patient satisfaction. The major factors influencing patients’ satisfaction with, and trust in, online appointment scheduling systems at public hospitals are highlighted: privacy protection enhances both patient satisfaction and trust; service quality positively affects satisfaction, but to a lesser degree; and website-related anxiety impacts perceived ease of use, although it has a limited influence on satisfaction. These findings can inform recommendations for hospital managers and portal designers seeking to increase user satisfaction. The study’s integrated TAM/ISSM model incorporates cultural and socioeconomic aspects specific to Morocco’s healthcare context.
Full article
Figure 1
Open Access Article
Exploring Named Entity Recognition via MacBERT-BiGRU and Global Pointer with Self-Attention
by
Chengzhe Yuan, Feiyi Tang, Chun Shan, Weiqiang Shen, Ronghua Lin, Chengjie Mao and Junxian Li
Big Data Cogn. Comput. 2024, 8(12), 179; https://doi.org/10.3390/bdcc8120179 - 3 Dec 2024
Abstract
Named Entity Recognition (NER) is a fundamental task in natural language processing that aims to identify and categorize named entities within unstructured text. In recent years, with the development of deep learning techniques, pre-trained language models have been widely used in NER tasks. However, these models still face limitations in terms of their scalability and adaptability, especially when dealing with complex linguistic phenomena such as nested entities and long-range dependencies. To address these challenges, we propose the MacBERT-BiGRU-Self Attention-Global Pointer (MB-GAP) model, which integrates MacBERT for deep semantic understanding, BiGRU for rich contextual information, self-attention for focusing on relevant parts of the input, and a global pointer mechanism for precise entity boundary detection. By optimizing the number of attention heads and global pointer heads, our model achieves an effective balance between complexity and performance. Extensive experiments on benchmark datasets, including ResumeNER, CLUENER2020, and SCHOLAT-School, demonstrate significant improvements over baseline models.
Full article
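A global pointer head scores every candidate (start, end) span jointly rather than labeling tokens one by one, which is what allows it to recover nested entities. A simplified sketch of the decoding step only, with a hand-written score matrix standing in for the model's output:

```python
def decode_spans(scores, threshold=0.0):
    """Return all (start, end, score) spans whose score exceeds the threshold.
    `scores[i][j]` is the model's score for a span from token i to token j;
    spans may overlap or nest, unlike token-level BIO decoding."""
    n = len(scores)
    spans = []
    for i in range(n):
        for j in range(i, n):
            if scores[i][j] > threshold:
                spans.append((i, j, scores[i][j]))
    return sorted(spans, key=lambda s: -s[2])

# Hand-written scores for a 4-token sentence: both the full 4-token span
# and a nested 2-token span score positive, so both are emitted.
S = [[-1.0, -1.0, -1.0,  2.5],
     [-1.0, -1.0,  1.2, -1.0],
     [-1.0, -1.0, -1.0, -1.0],
     [-1.0, -1.0, -1.0, -1.0]]
spans = decode_spans(S)
```

Because decoding reads the whole score matrix, the nested span (1, 2) survives alongside the enclosing span (0, 3), which a single BIO tag sequence could not express.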
(This article belongs to the Special Issue Research Progress in Artificial Intelligence and Social Network Analysis)
Figure 1
Topics
Topic in
BDCC, Digital, Information, Mathematics, Systems
Data-Driven Group Decision-Making
Topic Editors: Shaojian Qu, Ying Ji, M. Faisal Nadeem
Deadline: 31 December 2024
Topic in
BDCC, Data, Environments, Geosciences, Remote Sensing
Database, Mechanism and Risk Assessment of Slope Geologic Hazards
Topic Editors: Chong Xu, Yingying Tian, Xiaoyi Shao, Zikang Xiao, Yulong Cui
Deadline: 28 February 2025
Topic in
Applied Sciences, BDCC, Future Internet, Information, Sci
Social Computing and Social Network Analysis
Topic Editors: Carson K. Leung, Fei Hao, Giancarlo Fortino, Xiaokang Zhou
Deadline: 30 June 2025
Topic in
AI, BDCC, Fire, GeoHazards, Remote Sensing
AI for Natural Disasters Detection, Prediction and Modeling
Topic Editors: Moulay A. Akhloufi, Mozhdeh Shahbazi
Deadline: 25 July 2025
Conferences
Special Issues
Special Issue in
BDCC
Unlocking Minds and Machines: Advances in Cognitive Computing and Big Data Analytics
Guest Editors: Dan Vilenchik, Havana Rika
Deadline: 31 December 2024
Special Issue in
BDCC
Revolutionizing Healthcare: Exploring the Latest Advances in Digital Health Technology
Guest Editors: Hossein Hassani, Steve MacFeely
Deadline: 31 December 2024
Special Issue in
BDCC
Emerging Trends and Applications of Big Data in Robotic Systems
Guest Editors: Praneel Chand, Mansour Assaf, Mohammad Dabbagh, Nandini Sidnal
Deadline: 31 December 2024
Special Issue in
BDCC
Natural Language Processing Applications in Big Data
Guest Editors: Xingyi Song, Ye Jiang, Yunfei Long
Deadline: 31 December 2024