Examining Heterogeneity Structured on a Large Data Volume with Minimal Incompleteness
Abstract
While Big Data analytics can provide a variety of benefits, processing heterogeneous data comes with its own set of limitations. When working with Bitcoin data, transaction patterns must be studied independently. This study examines Twitter data related to Bitcoin and investigates the communication patterns in tweets about Bitcoin transactions. A large volume of tweets carrying the hashtags #Bitcoin or #BTC was gathered and mined to uncover the patterns that users (speculators, teachers, or other stakeholders) follow when discussing Bitcoin transactions on Twitter. The aim is to determine the direction of Bitcoin transaction tweets based on historical data. To this end, the research proposes using Big Data analytics to track Bitcoin transaction communications in tweets in order to discover a pattern. MapReduce on the Hadoop platform was used. The findings indicate that, in the map step of the procedure, Hadoop tokenizes the dataset and passes the tokens to the mapper, where thirteen patterns were established and then reduced to three using attributes of the data previously stored in the Hadoop context, one of which is the emoji data that was left out of previous research discussions. Text, however, is only one piece of the puzzle of Bitcoin transaction interaction, and the key finding is that there is “No certainty, only possibilities” in Bitcoin transactions.
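The map and reduce steps described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation and does not use Hadoop. It only simulates the map, shuffle, and reduce phases in plain Python to show how tweets, including their emoji, might be tokenized and counted. The token pattern `TOKEN_RE` and the function names are assumptions made for illustration, and the sketch does not reproduce the thirteen or three patterns reported in the study.

```python
import re
from collections import defaultdict

# Hypothetical token pattern (an assumption, not taken from the paper):
# a hashtag, a word, or any single non-space symbol. The last branch
# captures simple one-code-point emoji; multi-code-point emoji sequences
# would be split into several tokens.
TOKEN_RE = re.compile(r"#\w+|\w+|[^\w\s]")


def map_tweet(tweet):
    """Map step: emit a (token, 1) pair for every token in one tweet."""
    for token in TOKEN_RE.findall(tweet.lower()):
        yield token, 1


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce step: sum the emitted counts for each token."""
    return {key: sum(values) for key, values in grouped.items()}


def count_tokens(tweets):
    """Run map, shuffle, and reduce over an iterable of tweet strings."""
    pairs = (pair for tweet in tweets for pair in map_tweet(tweet))
    return reduce_counts(shuffle(pairs))


if __name__ == "__main__":
    sample = ["Buying #Bitcoin today 🚀", "#BTC to the moon 🚀🚀"]
    for token, count in sorted(count_tokens(sample).items()):
        print(token, count)
```

With the two sample tweets above, the rocket emoji is counted three times and each hashtag once. In an actual Hadoop job, the map and reduce functions would run as distributed tasks and the framework would perform the grouping step.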
Copyright (c) 2021 Nahla Aljojo

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.