3.1 DATA: Digital Society Study Guide

  • Writer: lukewatsonteach
  • Mar 31

Updated: May 15

DATA CONCEPTS (3.1) - EXAM PREPARATION


IB DP DIGITAL SOCIETY: NIGHT-BEFORE CRAM GUIDE


1. QUICK REFERENCE DIKW PYRAMID (15 minutes)

TASK: Create a pyramid diagram on a single page with these key definitions.

WISDOM - Application of knowledge with ethical judgment to make informed decisions

KNOWLEDGE - Information that has been understood and applied to recognise patterns

INFORMATION - Data that has been processed and organised into meaningful context

DATA - Raw, unprocessed facts or values without context (e.g., binary code)


DIKW PYRAMID

2. CORE DATA CONCEPTS FLASHCARDS (25 minutes)

TASK: Create 15 two-sided flashcards with concept on front, definition + example on back.


TYPES OF DATA CARDS (6):

  1. Quantitative Data - Numerical data that can be measured (e.g., website visits, follower counts)

  2. Qualitative Data - Non-numerical descriptive data (e.g., user reviews, comments)

  3. Cultural Data - Artistic expressions, language, social norms (e.g., Spotify listening patterns)

  4. Geographical Data - Location-based information (e.g., GPS tracking data, digital maps)

  5. Metadata - Data about data (e.g., photo EXIF data showing location, device, time)

  6. Scientific Data - Research information (e.g., digital telescope readings, genomic sequences)


USES OF DATA CARDS (4):

  1. Trend Identification - Discovering patterns over time (e.g., viral content patterns)

  2. Pattern Recognition - Finding regularities in data (e.g., consumer behaviors)

  3. Connection Mapping - Identifying relationships (e.g., linking influencers to demographics)

  4. Relationship Analysis - Understanding correlations (e.g., online behavior vs. purchases)


DATA LIFE CYCLE CARDS (5):

  1. Create/Collect - Initial data generation (e.g., social media harvesting)

  2. Store - Saving data for future use (e.g., cloud databases)

  3. Process - Cleaning and preparing data (e.g., algorithm filtering)

  4. Analyze - Examining data to discover information (e.g., AI pattern recognition)

  5. Access/Preserve/Reuse - Retrieving, maintaining, and repurposing data (e.g., data marketplaces)


TIP: Study your cards in sets, then mix them up. Focus on giving a unique digital example for each concept.


3. CRITICAL TERMINOLOGY MAP (20 minutes)

TASK: Create a mind map connecting the most frequently tested terms. Use a single sheet of paper and colored pens.


CENTRAL HUB: DATA CONCEPTS IN DIGITAL SOCIETY


BRANCH 1: DATA TYPES

  • Connect: Quantitative → Statistics → Big Data

  • Connect: Qualitative → Cultural Insights → Social Patterns

  • Connect: Metadata → Privacy Concerns → PII


BRANCH 2: DATA MANAGEMENT

  • Connect: Collection → Storage → Processing → Analysis

  • Connect: Databases → Classification → Relationships

  • Connect: Primary Collection → Secondary Use


BRANCH 3: DATA REPRESENTATION

  • Connect: Visualization → Charts → Infographics

  • Connect: Reports → Decision Making → Wisdom


BRANCH 4: DATA CONCERNS

  • Connect: Security → Encryption → Blockchain

  • Connect: Privacy → Anonymity → Surveillance

  • Connect: Bias → Reliability → Integrity


TIP: Draw lines between related concepts across different branches. These connections often form the basis of higher-mark questions!


4. EXAMINATION HOT TOPICS (30 minutes)

TASK: Create detailed notes on these three commonly tested areas.


DATA DILEMMAS (10 min):

  • Data Bias: Systematic errors leading to unfair outcomes

    • Example: Facial recognition systems failing for certain ethnicities

    • Test topic: Identifying bias in recommendation algorithms

  • Data Ownership: Who controls and has rights to data

    • Example: Social media platforms claiming ownership of user content

    • Test topic: Conflicts between personal and corporate data rights

  • Data Privacy: An individual's right to control how their personal data is collected, used, and shared

    • Example: Location tracking without informed consent

    • Test topic: PII (Personally Identifiable Information) management


BIG DATA - 4 Vs (10 min):

  • Volume: Massive scale of data collected

    • Example: Billions of daily social media interactions

    • Test topic: Storage and processing challenges

  • Variety: Different types and formats of data

    • Example: Text, images, video, location data combined

    • Test topic: Integration and analysis difficulties

  • Velocity: Speed of data generation and processing

    • Example: Real-time streaming analytics

    • Test topic: Requirements for instant decision-making

  • Veracity: Accuracy and trustworthiness of data

    • Example: Fake accounts vs. authentic user data

    • Test topic: Methods of ensuring data quality


DATA SECURITY ESSENTIALS (10 min):

  • Encryption: Converting readable data (plaintext) into unreadable ciphertext that can only be decoded with the correct key

    • Example: End-to-end encrypted messaging

    • Test topic: Symmetric (one shared key) vs. asymmetric (public/private key pair) approaches

  • Data Masking: Hiding sensitive information

    • Example: Credit card numbers displayed as ****-****-****-1234

    • Test topic: Maintaining utility while protecting privacy

  • Blockchain: Distributed ledger technology

    • Example: NFT verification of digital ownership

    • Test topic: Tamper-evident record keeping


TIP: For each hot topic, memorise ONE real-world example and TWO ethical implications.


5. QUESTION ATTACK STRATEGY (30 minutes)

TASK: Practice applying the D.I.K.E.S. framework to sample questions.


D.I.K.E.S. FRAMEWORK:

  • Define the key terms in the question

  • Identify the command term (describe, explain, evaluate)

  • Knowledge retrieval (recall relevant concepts)

  • Examples from digital society contexts

  • Structure answer appropriate to mark allocation


PRACTICE WITH SAMPLE AO1 QUESTIONS (2 marks each):

  1. Define metadata in a digital context.

  2. Outline two types of quantitative data collected by social media platforms.

  3. State two characteristics of big data.

  4. Identify two stages in the data life cycle.

  5. Define data encryption.

  6. State two examples of data masking techniques.

  7. Outline the difference between primary and secondary data collection.

  8. Define data integrity in digital environments.

  9. State two ways data can be represented visually.

  10. Identify two ethical concerns related to personal data.


PRACTICE WITH SAMPLE AO2 QUESTIONS (4 marks each):

  1. Explain how the DIKW pyramid applies to a digital shopping platform.

  2. Analyze how metadata can create privacy concerns in smartphone applications.

  3. Discuss how data bias might affect automated decision-making systems.

  4. Evaluate the importance of data security in financial technology applications.

  5. Explain how big data analytics can be used to understand online behavior.

  6. Discuss the ethical implications of data ownership on social media platforms.

  7. Analyze the relationship between data velocity and real-time decision making.

  8. Evaluate the effectiveness of blockchain in ensuring data integrity.

  9. Explain how different data visualization techniques affect understanding.

  10. Discuss the tension between data collection and user privacy.


TIP: For 2-mark questions, write 2 distinct points. For 4-mark questions, include at least one specific example and one implication.


DATA CONCEPTS (3.1) - COMPREHENSIVE PREPARATION


1. CONCEPTUAL FRAMEWORK: DIKW IN DIGITAL SOCIETY


Extended Definitions with Digital Context

Data in Digital Society: Data represents the fundamental building blocks of the digital world - raw, unprocessed facts, signals, or values lacking context or inherent meaning. In digital environments, this manifests as binary code, unformatted numbers, unstructured text, or raw sensor readings. Data alone carries minimal value until it's processed and contextualized.


Information in Digital Society: When data is processed, organized, structured, or presented within a meaningful context, it transforms into information. Digital platforms constantly perform this transformation, converting raw signals into meaningful indicators. The critical difference is that information answers basic questions (who, what, where, when) that data alone cannot address.


Knowledge in Digital Society: Knowledge emerges when information is interpreted, understood, and applied within frameworks of understanding. Digital systems facilitate knowledge creation by enabling pattern recognition and connection identification across vast datasets. Knowledge answers "how" questions by revealing the methods, processes, and relationships between different information points.


Wisdom in Digital Society: The pinnacle of the hierarchy, wisdom, involves applying knowledge with insight, judgment, and ethical consideration. In digital contexts, wisdom manifests as informed decision-making that considers long-term consequences, ethical implications, and human values. Wisdom addresses "why" questions and guides appropriate actions based on accumulated knowledge.
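
The first three levels can be sketched in a few lines of Python. This is an illustration only: the step counts, the variable names, and the 8,000-step threshold are made-up assumptions, not syllabus content. Wisdom appears only as a comment because it depends on human judgment rather than computation.

```python
from statistics import mean

# DATA: raw values with no context (hypothetical fitness-tracker readings)
daily_steps = [4200, 3900, 10500, 11200, 4100, 9800, 10100]

# INFORMATION: the same values organised with context (which day, what was measured)
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
steps_by_day = dict(zip(days, daily_steps))
weekly_average = mean(daily_steps)

# KNOWLEDGE: a pattern recognised in the information
ACTIVE_THRESHOLD = 8000  # assumed cut-off for an "active" day
active_days = [day for day, steps in steps_by_day.items() if steps >= ACTIVE_THRESHOLD]

print(f"Average steps per day: {weekly_average:.0f}")
print(f"Active days: {active_days}")

# WISDOM: deciding what to do with the pattern, e.g. whether nudging the user
# on inactive days supports their well-being or creates anxiety.
# That judgment is made by people, not by this script.
```

In an exam answer, the same progression can be described in words: name the raw values, say how they are organised, state the pattern, then discuss the decision and its ethical implications.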


Case Studies Showing Progression from Data to Wisdom and Exam Practice Questions

Case Study 1: Smart City Transportation

  • Data Layer: Raw GPS coordinates from thousands of vehicles, traffic light status signals, pedestrian counting sensors, weather station readings.

  • Information Layer: Current traffic density on specific roads, average vehicle speeds, pedestrian volumes at intersections, real-time weather conditions.

  • Knowledge Layer: Recognition of traffic patterns (rush hour flows, event-related congestion), understanding how weather affects transportation choices, identification of accident-prone areas.

  • Wisdom Layer: Implementation of dynamic traffic management systems that balance efficiency with environmental impact, equity of access, and community well-being; long-term urban planning decisions that prioritize sustainable mobility solutions.


1. EVALUATE the ethical implications of using the DIKW model in smart city transportation systems.

  • Examination Tips:

    • Define the DIKW pyramid in relation to transportation data

    • Analyze benefits (efficiency, sustainability, resource optimization) AND drawbacks (surveillance, privacy risks, digital divides)

    • Consider multiple stakeholder perspectives (city planners, citizens, privacy advocates, businesses)

    • Evaluate how moving from data to wisdom changes ethical considerations

    • Conclude with a balanced judgment on conditions for ethical implementation


2. TO WHAT EXTENT should automated wisdom-level decision making be implemented in smart city transportation systems?

  • Examination Tips:

    • Distinguish between different DIKW levels in transportation context

    • Analyze appropriate areas for automation AND where human oversight remains essential

    • Consider technical limitations, ethical implications, and social factors

    • Include specific examples of automated vs. human decision points

    • Reach a nuanced conclusion about the boundaries of automation in urban planning


Case Study 2: Digital Health Platform

  • Data Layer: Heart rate readings, step counts, sleep duration measurements, food logging entries, medical test results.

  • Information Layer: Daily activity levels, sleep quality indicators, caloric intake summaries, health metric trends over time.

  • Knowledge Layer: Understanding correlations between exercise and sleep quality, recognizing patterns between dietary choices and energy levels, identifying potential health risk factors.

  • Wisdom Layer: Development of personalized health recommendations that consider individual circumstances, ethical handling of sensitive health information, balanced approach to technology-assisted wellness that promotes genuine well-being rather than anxiety or obsession.


3. DISCUSS how the transformation from raw health data to actionable wisdom affects different stakeholders in digital healthcare ecosystems.

  • Examination Tips:

    • Identify key stakeholders (patients, healthcare providers, insurance companies, platform developers)

    • Analyze how each stakeholder's interests and concerns change across DIKW levels

    • Consider power dynamics in who controls the transformation process

    • Include specific examples of benefits and risks at each level

    • Present multiple perspectives on data ownership and algorithmic recommendations


4. COMPARE AND CONTRAST the regulatory approaches needed at different levels of the DIKW pyramid in digital health platforms.

  • Examination Tips:

    • Distinguish regulatory needs for raw data vs. processed information vs. applied wisdom

    • Analyze similarities in protection needs across all levels

    • Identify key differences in regulatory approaches required (technical standards vs. ethical frameworks)

    • Consider global variations in health data regulation

    • Discuss how context and cultural factors affect appropriate governance models


Case Study 3: Social Media Analytics

  • Data Layer: Click events, view durations, scroll patterns, reaction selections, comment text, share actions.

  • Information Layer: Engagement rates, popular content categories, demographic breakdowns, sentiment analysis results.

  • Knowledge Layer: Understanding content virality factors, recognition of community formation patterns, identification of influence networks, awareness of polarization dynamics.

  • Wisdom Layer: Platform design decisions that promote healthy discourse over engagement maximization, content moderation approaches that balance free expression with harm prevention, algorithmic recommendation systems that consider long-term user well-being and societal impacts.


5. EXAMINE how the progression from data to wisdom in social media analytics influences societal polarization and information bubbles.

  • Examination Tips:

    • Explain how each DIKW level contributes to content curation and recommendation

    • Analyze the relationship between algorithmic wisdom and information diversity

    • Consider intended consequences AND unintended effects on social cohesion

    • Include specific examples of how platforms translate engagement data into content decisions

    • Evaluate different approaches to balancing personalization with information diversity


6. TO WHAT EXTENT are social media companies responsible for the wisdom-level outcomes of their data processing systems?

  • Examination Tips:

    • Define the scope of corporate responsibility across different DIKW levels

    • Analyze arguments for expanded responsibility (platform power, societal impact) AND limited responsibility (user agency, free expression)

    • Consider legal, ethical, and practical dimensions of responsibility

    • Include specific examples of platform policies and their effects

    • Develop a nuanced position on where responsibility boundaries should lie


2. DATA TAXONOMY EXPLORER

Comprehensive Classification of Data Types

Quantitative Data in Digital Society:

  • Definition: Numerical data that can be measured and analysed using statistical methods.

  • Subcategories:

    • Discrete: Countable values (e.g., number of website visits, download counts)

    • Continuous: Measurable values on a scale (e.g., time spent on apps, scroll depth percentages)

    • Ordinal: Ranked numerical values (e.g., star ratings, satisfaction scores)

    • Ratio: Values with meaningful zero points (e.g., file sizes, data transfer speeds)


Qualitative Data in Digital Society:

  • Definition: Non-numerical data that describes qualities or characteristics.

  • Subcategories:

    • Textual: Written information (e.g., user comments, reviews, forum discussions)

    • Visual: Image-based information (e.g., photos, infographics, visual designs)

    • Auditory: Sound-based information (e.g., voice recordings, audio preferences)

    • Behavioral: Action-based information (e.g., navigation patterns, feature usage)


Cultural Data in Digital Society:

  • Definition: Information related to artistic expressions, traditions, language use, and social norms.

  • Subcategories:

    • Creative Works: (e.g., digital art, music streams, fiction)

    • Linguistic Patterns: (e.g., evolving online language, emoji usage trends)

    • Value Expressions: (e.g., cause-related engagement, community standards)

    • Tradition Documentation: (e.g., digital archives of cultural practices)


Metadata in Digital Society:

  • Definition: Data about data that provides information about characteristics of other data.

  • Subcategories:

    • Descriptive: Information about content (e.g., file names, tags, titles)

    • Structural: Information about organization (e.g., file formats, data relationships)

    • Administrative: Information about management (e.g., creation dates, permission settings)

    • Technical: Information about systems (e.g., device specifications, software versions)
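
To see the difference between data and metadata, the sketch below creates a small temporary file and reads what the operating system records about it. The file content and the dictionary keys are hypothetical choices for illustration; the labels in the comments map loosely onto the subcategories above.

```python
import os
import tempfile
from datetime import datetime, timezone

# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("Meet at the library at 4pm.")  # this is the data itself
    path = handle.name

info = os.stat(path)  # metadata kept by the operating system

metadata = {
    "file_name": os.path.basename(path),     # descriptive
    "extension": os.path.splitext(path)[1],  # structural
    "size_bytes": info.st_size,              # technical
    "modified_utc": datetime.fromtimestamp(  # administrative
        info.st_mtime, tz=timezone.utc
    ).isoformat(),
}

for key, value in metadata.items():
    print(f"{key}: {value}")

os.remove(path)  # clean up the temporary file
```

Note that none of the metadata reveals what the message says, yet the timestamp alone shows when the file's owner was active. This is why metadata links to privacy concerns.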


Real-World Examples Across Platforms

Social Media Platforms Data Ecosystem:

  • Quantitative: Follower counts, engagement rates, impression numbers, video completion rates

  • Qualitative: Comment sentiment, content themes, visual aesthetics, conversation topics

  • Cultural: Meme evolution, platform-specific language, community norms, trending topics

  • Metadata: Post timestamps, location tags, device information, edit history


E-Commerce Platform Data Ecosystem:

  • Quantitative: Pricing data, inventory levels, conversion rates, average order values

  • Qualitative: Product reviews, customer feedback, support conversations, return reasons

  • Cultural: Gift-giving patterns, seasonal preferences, regional purchase variations

  • Metadata: Browser information, session duration, shopping cart evolution, wishlist history


Streaming Service Data Ecosystem:

  • Quantitative: Viewing durations, subscription metrics, content completion rates, peak usage times

  • Qualitative: Genre preferences, content ratings, viewing contexts, search queries

  • Cultural: Regional content popularity, language preferences, viewing rituals, co-viewing habits

  • Metadata: Device type, streaming quality settings, pause/resume patterns, watchlist organisation


Smart City Data Ecosystem:

  • Quantitative: Traffic volumes, energy consumption, public service utilisation, environmental readings

  • Qualitative: Resident feedback, community priorities, quality of life indicators, public space usage

  • Cultural: Event attendance, community engagement patterns, neighbourhood characteristics

  • Metadata: Temporal patterns, spatial distributions, system interconnections, data provenance


3. DATA LIFE CYCLE ANALYSIS

Detailed Breakdown of Each Stage

1. Create/Collect/Extract Stage:

  • Definition: The initial phase where data is generated, gathered, or pulled from various digital sources.

  • Digital Processes:

    • Active collection through user inputs (forms, uploads, surveys)

    • Passive collection through sensors and tracking tools (cookies, IoT devices)

    • Algorithmic generation (synthetic data, simulations)

    • API-based extraction from external platforms

    • Web scraping of publicly available information

  • Key Technologies:

    • Data collection APIs

    • Web crawlers and scrapers

    • IoT sensor networks

    • Mobile device SDKs

    • Input form systems


2. Storage Stage:

  • Definition: The phase where data is saved in digital repositories, databases, or storage systems.

  • Digital Processes:

    • Database indexing and organization

    • Cloud storage allocation

    • Backup creation and management

    • Archive classification

    • Redundancy implementation

  • Key Technologies:

    • Relational databases (SQL)

    • NoSQL databases (document, graph, key-value)

    • Data lakes and warehouses

    • Distributed storage systems

    • Blockchain ledgers


3. Processing Stage:

  • Definition: The stage where raw data is cleaned, transformed, and prepared for analysis.

  • Digital Processes:

    • Data cleaning (removing errors, duplicates)

    • Normalization and standardization

    • Transformation into usable formats

    • Feature extraction and engineering

    • Aggregation and summarization

  • Key Technologies:

    • ETL (Extract, Transform, Load) pipelines

    • Data processing frameworks (Apache Spark, Hadoop)

    • Machine learning preprocessing libraries

    • Automated data quality tools

    • Stream processing systems
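
The processing steps listed above (cleaning, standardisation, aggregation) can be sketched on a tiny dataset. The records and field names here are invented for illustration; real pipelines use dedicated tools such as the ETL frameworks named above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records: (user, country, minutes_on_app)
raw_records = [
    ("ana", "PT", "35"),
    ("ana", "PT", "35"),   # exact duplicate
    ("ben", "uk ", "50"),  # inconsistent formatting
    ("chen", "SG", ""),    # missing value
    ("dara", "UK", "20"),
]

# 1. Cleaning: drop duplicates and rows with a missing value
seen = set()
cleaned = []
for record in raw_records:
    if record in seen or record[2] == "":
        continue
    seen.add(record)
    cleaned.append(record)

# 2. Standardisation: consistent country codes, minutes as numbers
standardised = [
    (user, country.strip().upper(), int(minutes))
    for user, country, minutes in cleaned
]

# 3. Aggregation: average minutes per country
minutes_by_country = defaultdict(list)
for _, country, minutes in standardised:
    minutes_by_country[country].append(minutes)
average_minutes = {
    country: mean(values) for country, values in minutes_by_country.items()
}

print(average_minutes)
```

Each step involves a choice (for example, discarding the row with a missing value) that can introduce bias, which connects this stage to the ethical considerations of processing.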


4. Analysis Stage:

  • Definition: The examination of data using digital tools and techniques to discover useful information.

  • Digital Processes:

    • Statistical analysis and testing

    • Pattern recognition and trend identification

    • Predictive modeling and forecasting

    • Network and relationship mapping

    • Anomaly detection and outlier analysis

  • Key Technologies:

    • Business intelligence platforms

    • Machine learning algorithms

    • Natural language processing systems

    • Network analysis tools

    • Visualization frameworks
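
As one example of the analysis processes above, anomaly detection can be sketched with a simple z-score check: a value is flagged when it lies far from the average relative to the spread of the data. The login counts and the threshold of 2.0 are assumptions chosen for illustration; real systems use more robust methods.

```python
from statistics import mean, stdev

# Hypothetical daily login counts for one account over ten days
logins = [12, 14, 11, 13, 12, 15, 13, 95, 12, 14]

average = mean(logins)
spread = stdev(logins)  # sample standard deviation

Z_THRESHOLD = 2.0  # assumed cut-off for "unusual"

# Flag any day whose count is more than Z_THRESHOLD standard deviations
# away from the average. Days are numbered from 1.
anomalies = [
    (day, count)
    for day, count in enumerate(logins, start=1)
    if abs(count - average) / spread > Z_THRESHOLD
]

print(f"Average logins: {average:.1f}")
print(f"Unusual days: {anomalies}")
```

A flagged day is only a prompt for investigation. Whether it indicates a hacked account or a student revising intensively is an interpretation, which is where the "Interpretation Responsibility" consideration applies.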


5. Access Stage:

  • Definition: The retrieval of data through digital interfaces, queries, or applications.

  • Digital Processes:

    • User authentication and authorization

    • Query optimization and execution

    • Real-time data serving

    • Access control enforcement

    • Information delivery formatting

  • Key Technologies:

    • API gateways and management systems

    • Dashboard platforms

    • Mobile app interfaces

    • Search engines and recommendation systems

    • Data marketplaces


6. Preservation Stage:

  • Definition: The maintenance of data integrity and availability over time.

  • Digital Processes:

    • Version control implementation

    • Format migration for longevity

    • Integrity validation and verification

    • Historical record maintenance

    • Degradation prevention

  • Key Technologies:

    • Digital archives and preservation systems

    • Content-addressed storage

    • Cryptographic verification tools

    • Temporal databases

    • Immutable storage solutions
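
Integrity validation is commonly done by storing a cryptographic hash (a "fingerprint") alongside the data and recomputing it later. The sketch below uses SHA-256 from Python's standard library; the record text and function names are hypothetical.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected: str) -> bool:
    """Check whether the data still matches the stored fingerprint."""
    return fingerprint(data) == expected


# When the record is archived, its fingerprint is stored with it.
original = b"Census record 1901: household of 5"
stored_fingerprint = fingerprint(original)

# Years later, the archive recomputes the hash to verify the record.
unchanged = is_intact(original, stored_fingerprint)
tampered = is_intact(b"Census record 1901: household of 6", stored_fingerprint)

print(f"Original intact: {unchanged}")
print(f"Altered copy intact: {tampered}")
```

Changing a single character produces a completely different hash, so even small alterations or storage errors are detected.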


7. Reuse Stage:

  • Definition: The application of existing data for new purposes or combinations.

  • Digital Processes:

    • Data sharing and distribution

    • Dataset combination and integration

    • Repurposing for secondary analysis

    • Knowledge transfer and application

    • Open data publication

  • Key Technologies:

    • Data exchange formats and standards

    • Open data portals

    • Data licensing frameworks

    • Interoperability protocols

    • Data marketplace platforms


Ethical Considerations at Each Point

Create/Collect/Extract - Ethical Considerations:

  • Informed Consent: Are subjects aware of what data is being collected and how it will be used?

  • Power Dynamics: Does the collector have disproportionate power over those from whom data is collected?

  • Cultural Sensitivity: Does collection respect cultural norms and sensitivities?

  • Minimization: Is only necessary data being collected, or is collection excessive?

  • Case Example: Cambridge Analytica's harvesting of Facebook user data without proper consent


Storage - Ethical Considerations:

  • Security Responsibility: Are adequate protections in place to prevent unauthorized access?

  • Duration Limitations: Is data being stored longer than necessary?

  • Right to be Forgotten: Can individuals request deletion of their personal data?

  • Environmental Impact: What is the energy and resource cost of maintaining massive data storage?

  • Case Example: Equifax data breach exposing personal information of 147 million people


Processing - Ethical Considerations:

  • Transparency: Are processing methods disclosed to stakeholders?

  • Data Integrity: Is processing maintaining the original context and meaning?

  • Bias Prevention: Are processing methods introducing or amplifying biases?

  • Resource Allocation: Who benefits from the resource-intensive data processing?

  • Case Example: ProPublica's discovery of racial bias in recidivism risk assessment algorithms


Analysis - Ethical Considerations:

  • Interpretation Responsibility: Are conclusions drawn responsibly and within the limits of the data?

  • Algorithmic Accountability: Who is responsible for automated analytical decisions?

  • Pluralistic Perspectives: Are diverse viewpoints considered in analysis approaches?

  • Correlation vs. Causation: Are causal claims made appropriately?

  • Case Example: Target predicting customer pregnancies based on shopping patterns, raising privacy concerns


Access - Ethical Considerations:

  • Equitable Access: Is data access distributed fairly across different communities?

  • Digital Divide: Are access methods considerate of varying levels of digital literacy?

  • Privacy Boundaries: Is access to sensitive information properly restricted?

  • Accessibility: Are data interfaces usable by people with disabilities?

  • Case Example: Health data accessibility disparities during the COVID-19 pandemic


Preservation - Ethical Considerations:

  • Historical Accuracy: Is context preserved alongside raw data?

  • Generational Responsibility: How will future generations interpret and use preserved data?

  • Right to Change: Should people be able to update historical data about themselves?

  • Resource Sustainability: Is long-term preservation ecologically sustainable?

  • Case Example: Internet Archive's preservation of deleted government climate data


Reuse - Ethical Considerations:

  • Purpose Expansion: Is data being used for purposes beyond original consent?

  • Attribution: Are original data sources properly credited?

  • Combination Effects: Does merging datasets create new privacy or ethical concerns?

  • Open Access Balance: How can openness be balanced with the protection of sensitive information?

  • Case Example: Medical research data being repurposed by pharmaceutical companies for commercial gain
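
The "Combination Effects" concern above can be demonstrated with two small invented datasets. Neither reveals a named person's health condition alone, but joining them on shared fields (sometimes called quasi-identifiers) does. All records, names, and field choices are hypothetical.

```python
# Dataset A: an "anonymised" health survey (names removed)
health_survey = [
    {"postcode": "AB1", "birth_year": 1990, "gender": "F", "condition": "asthma"},
    {"postcode": "AB1", "birth_year": 1985, "gender": "M", "condition": "diabetes"},
    {"postcode": "CD2", "birth_year": 1990, "gender": "F", "condition": "none"},
]

# Dataset B: a public membership list (no health data)
public_list = [
    {"name": "Person A", "postcode": "AB1", "birth_year": 1990, "gender": "F"},
    {"name": "Person B", "postcode": "CD2", "birth_year": 1972, "gender": "M"},
]

# Fields that appear in both datasets and can be used to link them
QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")


def link_key(record: dict) -> tuple:
    """Build a tuple of the shared fields for matching records."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


# Index the survey rows by their shared fields
survey_index = {}
for row in health_survey:
    survey_index.setdefault(link_key(row), []).append(row)

# A person is re-identified when exactly one survey row matches them
reidentified = []
for person in public_list:
    matches = survey_index.get(link_key(person), [])
    if len(matches) == 1:
        reidentified.append((person["name"], matches[0]["condition"]))

print(reidentified)
```

This is a useful example for questions on purpose expansion: removing names does not guarantee anonymity once datasets are combined.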


4. DATA REPRESENTATION WORKSHOP

Techniques and Tools for Visualisation

Statistical Charts and Graphs:

  • Bar Charts/Histograms:

    • Best for: Comparing categorical data, showing distributions

    • Digital Examples: Social media engagement by platform, website traffic by source

    • Tools: Tableau, D3.js, Excel, Google Charts

  • Line Charts/Time Series:

    • Best for: Showing trends over time, continuous relationships

    • Digital Examples: User growth, content consumption patterns, usage fluctuations

    • Tools: Grafana, Highcharts, R (ggplot2), Python (Matplotlib)

  • Pie/Donut Charts:

    • Best for: Showing composition and proportions of a whole

    • Digital Examples: Device type distribution, feature usage breakdown

    • Tools: Chart.js, Google Data Studio, Power BI

  • Scatter Plots:

    • Best for: Showing relationships between two variables

    • Digital Examples: Correlation between time spent and engagement, price vs. rating

    • Tools: Plotly, Python (Seaborn), Tableau


Interactive Visualizations:

  • Dashboards:

    • Best for: Combining multiple metrics in real-time monitoring

    • Digital Examples: Business analytics platforms, social media management tools

    • Tools: Kibana, Looker, Domo, Databox

  • Interactive Maps:

    • Best for: Geographical data, spatial relationships

    • Digital Examples: User distribution maps, service coverage areas, location-based analytics

    • Tools: Mapbox, Leaflet, QGIS, ArcGIS Online

  • Heat Maps:

    • Best for: Showing intensity variations across two dimensions

    • Digital Examples: Website click tracking, engagement hotspots, attention mapping

    • Tools: Hotjar, Crazy Egg, VWO

  • Tree Maps:

    • Best for: Hierarchical data showing relationships and proportions

    • Digital Examples: File storage usage, content categories, market segmentation

    • Tools: Treemap.js, Google Charts, Tableau


Network and Relationship Visualizations:

  • Network Graphs:

    • Best for: Showing connections and relationships between entities

    • Digital Examples: Social networks, influence mapping, website link structures

    • Tools: Gephi, Cytoscape, SigmaJS, Neo4j Bloom

  • Sankey Diagrams:

    • Best for: Visualizing flows and transitions between states

    • Digital Examples: User journeys, conversion funnels, information flows

    • Tools: Google Charts, D3.js Sankey, RAWGraphs

  • Chord Diagrams:

    • Best for: Showing inter-relationships between categories

    • Digital Examples: Cross-platform user movement, content recommendation patterns

    • Tools: D3.js, Circos, Flourish


Advanced Visualization Techniques:

  • Infographics:

    • Best for: Combining data with narrative and design elements

    • Digital Examples: Annual reports, trend summaries, educational content

    • Tools: Canva, Piktochart, Adobe Illustrator, Infogram

  • Data Storytelling Platforms:

    • Best for: Creating interactive narratives driven by data

    • Digital Examples: Journalistic investigations, public interest reporting, annual reviews

    • Tools: Flourish Story, Shorthand, ScrollStory

  • Real-time Visualizations:

    • Best for: Monitoring dynamic systems and immediate feedback

    • Digital Examples: Social media command centers, IoT dashboards, system monitoring

    • Tools: SignalFx, Datadog, Grafana Live

  • Immersive Visualizations:

    • Best for: Complex multidimensional data exploration

    • Digital Examples: VR data environments, augmented reality overlays, data sculptures

    • Tools: Unity3D, A-Frame, Unreal Engine


5. SECURITY AND PRIVACY DEEP DIVE

Technologies and Approaches

Data Encryption Methods:

  • Symmetric Encryption:

    • How it Works: Uses the same key for encryption and decryption

    • Digital Applications: File encryption, database security, communication sessions

    • Examples: AES, 3DES, ChaCha20

  • Asymmetric Encryption:

    • How it Works: Uses public-private key pairs for secure communication

    • Digital Applications: Secure messaging, digital signatures, authentication

    • Examples: RSA, ECC, PGP

  • End-to-End Encryption:

    • How it Works: Only communicating users can read messages; platforms cannot access content

    • Digital Applications: Secure messaging apps, video calls, email

    • Examples: Signal Protocol, WhatsApp, ProtonMail

  • Homomorphic Encryption:

    • How it Works: Allows computations on encrypted data without decryption

    • Digital Applications: Privacy-preserving analytics, secure cloud computing

    • Examples: IBM HElib, Microsoft SEAL
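
The core idea of symmetric encryption, that the same key both locks and unlocks the data, can be shown with a toy XOR cipher. This is a teaching sketch only and is not AES or any of the algorithms named above; the message and function name are hypothetical, and real systems should rely on vetted cryptographic libraries.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Combine each byte of the data with the matching byte of the key."""
    return bytes(d ^ k for d, k in zip(data, key))


message = b"Exam at 9am"

# A random key as long as the message, shared secretly by sender and receiver
key = secrets.token_bytes(len(message))

ciphertext = xor_bytes(message, key)    # sender encrypts
recovered = xor_bytes(ciphertext, key)  # receiver decrypts with the SAME key

print(f"Ciphertext (hex): {ciphertext.hex()}")
print(f"Recovered: {recovered.decode()}")
```

The weakness this exposes is key distribution: both parties need the same secret key, which is the problem asymmetric (public/private key) encryption is designed to address.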


Data Masking Techniques:

  • Substitution:

    • How it Works: Replacing sensitive data with fictional but realistic values

    • Digital Applications: Test environments, analytics datasets

    • Examples: Real customer names replaced with realistic fictional names in a test database

  • Shuffling:

    • How it Works: Rearranging sensitive data within a dataset

    • Digital Applications: Research databases, development environments

    • Examples: Randomizing customer records while maintaining overall distribution

  • Tokenization:

    • How it Works: Replacing sensitive data with non-sensitive placeholders

    • Digital Applications: Payment processing, healthcare systems

    • Examples: Apple Pay tokens replacing actual credit card numbers

  • Redaction:

    • How it Works: Completely removing or blacking out sensitive information

    • Digital Applications: Document sharing, public record releases

    • Examples: PDF redaction tools, automated PII scanners
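
Three of these ideas can be sketched in short Python functions: character masking of a card number, tokenization, and redaction. The function names, the in-memory "vault", the sample card number, and the simplified email pattern are all assumptions for illustration; production systems store tokens in a secured service.

```python
import re
import secrets


def mask_card(card_number: str) -> str:
    """Character masking: show only the last four digits."""
    digits = re.sub(r"\D", "", card_number)  # strip spaces and dashes
    return "****-****-****-" + digits[-4:]


# Tokenization: the real value is stored separately in a protected lookup
# table; only the meaningless token is passed around.
_token_vault = {}


def tokenize(value: str) -> str:
    token = "tok_" + secrets.token_hex(8)
    _token_vault[token] = value
    return token


def detokenize(token: str) -> str:
    return _token_vault[token]


def redact_emails(text: str) -> str:
    """Redaction: remove email addresses entirely (simplified pattern)."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[REDACTED]", text)


card = "4111 1111 1111 1234"  # made-up number for the example
token = tokenize(card)

print(mask_card(card))
print(token)
print(redact_emails("Contact sam@example.com today"))
```

Masking and redaction are one-way for the viewer, while tokenization is reversible by whoever controls the vault. That difference is worth stating when a question asks about maintaining utility while protecting privacy.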


Data Erasure Methods:

  • Digital Shredding:

    • How it Works: Overwriting data multiple times to prevent recovery

    • Digital Applications: Device decommissioning, sensitive file deletion

    • Examples: DoD 5220.22-M standard, Gutmann method

  • Crypto-shredding:

    • How it Works: Destroying encryption keys, making encrypted data unreadable

    • Digital Applications: Cloud storage, distributed systems

    • Examples: Key rotation policies with secure deletion of old keys

  • Right to be Forgotten Implementation:

    • How it Works: Systematically removing personal data across systems

    • Digital Applications: User account deletion, compliance with privacy laws

    • Examples: GDPR compliance systems, platform account deletion tools
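
As a rough illustration of digital shredding, the Python sketch below overwrites a file with random bytes before deleting it. The function name `shred_file` and the three-pass default are assumptions made for this example, not part of any standard. On SSDs, copy-on-write file systems, and cloud storage, overwriting in place may not reach every physical copy of the data, which is one reason crypto-shredding is used in those settings.

```python
import os
import tempfile


def shred_file(path, passes=3):
    """Overwrite a file with random bytes several times, then delete it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as handle:
        for _ in range(passes):
            handle.seek(0)
            handle.write(os.urandom(size))
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the bytes to disk
    os.remove(path)


# Demo on a throwaway file containing invented "sensitive" data
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as demo:
    demo.write(b"account=12345; pin=0000")

shred_file(demo_path)
file_still_exists = os.path.exists(demo_path)
```

Simply deleting a file usually removes only the pointer to it; the overwrite step is what makes recovery difficult.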


Blockchain Security Applications:

  • Immutable Records:

    • How it Works: Tamper-evident ledgers using cryptographic hashing

    • Digital Applications: Supply chain verification, credential verification

    • Examples: Bitcoin, Ethereum, Hyperledger

  • Smart Contracts:

    • How it Works: Self-executing contracts with the terms directly written into code

    • Digital Applications: Automated agreements, conditional transactions

    • Examples: Ethereum smart contracts, Solidity programming language

  • Zero-Knowledge Proofs:

    • How it Works: Proving knowledge of information without revealing the information itself

    • Digital Applications: Identity verification, private transactions

    • Examples: Zcash, zk-SNARKs
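
The "tamper-evident ledger" idea behind immutable records can be sketched in a few lines of Python. This is a simplified hash chain, not a real blockchain: there is no network, no consensus mechanism, and no mining, and the supply chain entries are invented. It only shows why changing one old record is detectable.

```python
import hashlib
import json


def block_hash(block):
    """Hash the block's contents (everything except its own hash)."""
    payload = json.dumps(
        {"index": block["index"], "data": block["data"], "prev_hash": block["prev_hash"]},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a block that stores the hash of the block before it."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"index": len(chain), "data": data, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True


ledger = []
add_block(ledger, "Farm ships 100 kg coffee")
add_block(ledger, "Importer receives 100 kg coffee")
add_block(ledger, "Retailer receives 40 kg coffee")
valid_before = is_valid(ledger)

ledger[1]["data"] = "Importer receives 500 kg coffee"  # someone edits history
valid_after = is_valid(ledger)
```

Because each block stores the previous block's hash, an attacker would have to recompute every later block as well, and on a real distributed ledger would have to do so on most copies at once.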


Regulatory Frameworks

Global Privacy Regulations:

  • General Data Protection Regulation (GDPR):

    • Key Provisions: Right to access, right to be forgotten, data portability, consent requirements

    • Territorial Scope: Personal data of individuals in the EU, regardless of where the company is located

    • Impact on Digital Society: Standardized privacy notices, consent management platforms, data protection officers

  • California Consumer Privacy Act (CCPA):

    • Key Provisions: Right to know, right to delete, right to opt-out of data sales

    • Territorial Scope: Businesses serving California residents that meet certain thresholds

    • Impact on Digital Society: "Do Not Sell My Personal Information" links, privacy policy updates, consumer rights portals

  • Personal Information Protection and Electronic Documents Act (PIPEDA):

    • Key Provisions: Consent requirements, purpose limitation, individual access

    • Territorial Scope: Canadian private-sector organizations

    • Impact on Digital Society: Privacy by design implementation, breach notification protocols

  • Brazil's Lei Geral de Proteção de Dados (LGPD):

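    • Key Provisions: Modeled on GDPR: legal bases for processing, data subject rights, and a national data protection authority (ANPD)

    • Territorial Scope: Organizations processing personal data of individuals located in Brazil, regardless of citizenship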

    • Impact on Digital Society: Data protection officers, legal basis documentation


Industry-Specific Regulations:

  • Health Insurance Portability and Accountability Act (HIPAA):

    • Key Provisions: Privacy Rule, Security Rule, Breach Notification Rule

    • Digital Applications: Electronic health records, health apps, telemedicine

    • Impact: Standardized data security in healthcare, patient access portals

  • Children's Online Privacy Protection Act (COPPA):

    • Key Provisions: Parental consent for data collection from children under 13

    • Digital Applications: Social media, gaming platforms, educational technology

    • Impact: Age verification systems, limited data collection from minors

  • Payment Card Industry Data Security Standard (PCI DSS):

    • Key Provisions: Secure network requirements, vulnerability management, access control

    • Digital Applications: E-commerce platforms, payment processors, financial apps

    • Impact: Tokenization of payment data, secure checkout protocols


Emerging Regulatory Approaches:

  • Algorithmic Accountability:

    • Key Concepts: Transparency requirements, impact assessments, discrimination testing

    • Digital Applications: AI systems, automated decision-making, recommendation engines

    • Examples: EU AI Act, New York City Local Law 144 (bias audits of automated hiring tools)

  • Data Sovereignty:

    • Key Concepts: Geographic restrictions on data storage and processing

    • Digital Applications: Cloud services, multinational operations, data transfers

    • Examples: EU-US Data Privacy Framework, China's Data Security Law

  • Digital Identity Frameworks:

    • Key Concepts: Standardized identity verification, self-sovereign identity

    • Digital Applications: Government services, financial verification, cross-platform authentication

    • Examples: eIDAS (EU), India's Aadhaar system


Case Studies of Successes and Failures

Success Case Study: Apple's Privacy-Centric Approach

  • Context: Apple implemented App Tracking Transparency (ATT) requiring explicit user consent for tracking across apps and websites.

  • Technical Implementation: iOS privacy labels, tracking permission popups, privacy-preserving analytics

  • Outcome: Increased user control, disrupted targeted advertising industry, established privacy as competitive advantage

  • Digital Society Impact: Normalized opt-in consent models, created market pressure for privacy features


Success Case Study: Signal Messenger's Security Architecture

  • Context: Signal was developed as a privacy-focused alternative to traditional messaging apps

  • Technical Implementation: End-to-end encryption, minimal metadata storage, open-source protocol

  • Outcome: Growth in user base during privacy controversies, widely adopted protocol, security expert endorsements

  • Digital Society Impact: Demonstrated that a nonprofit, donation-funded model can sustain a privacy-first service; influenced features of mainstream platforms


Failure Case Study: Equifax Data Breach

  • Context: 2017 breach exposed the personal data of approximately 147 million people

  • Technical Failures: Unpatched vulnerabilities, inadequate network segmentation, delayed detection

  • Outcome: Settlement of up to $700 million, reputation damage, regulatory scrutiny

  • Digital Society Impact: Highlighted inadequacy of existing data protection practices, strengthened breach notification laws


Failure Case Study: Facebook/Cambridge Analytica Scandal

  • Context: Political consulting firm harvested data from up to 87 million Facebook users without their consent

  • Technical/Policy Failures: Overly permissive API access, inadequate third-party monitoring, insufficient consent mechanisms

  • Outcome: $5 billion FTC fine, damaged trust, Congressional hearings

  • Digital Society Impact: Sharpened regulatory scrutiny just as GDPR took effect, raised awareness of data harvesting practices, inspired new privacy legislation



GLOSSARY OF DATA CONCEPTS IN DIGITAL SOCIETY (3.1)

3.1A Data, Information, Knowledge, and Wisdom

Data: Raw, unprocessed facts, signals, or values without context or meaning in digital environments (e.g., binary code, unformatted numbers, unstructured text).

Information: Data that has been processed, organized, structured, or presented in a meaningful context to make it useful in digital settings.

Knowledge: Information that has been interpreted, understood, and applied within a framework of understanding, enabling patterns and connections to be recognized across digital platforms.

Wisdom: The application of knowledge with insight, judgment, and ethical consideration to make informed decisions in digital society contexts.

DIKW Pyramid: A hierarchical model representing the relationships between Data, Information, Knowledge, and Wisdom, showing how each builds upon the previous level with increasing value and complexity in digital environments.


3.1B Types of Data

Quantitative Data: Numerical data that can be measured and analyzed using statistical methods (e.g., website visits, download counts, follower numbers).

Qualitative Data: Non-numerical data that describes qualities or characteristics (e.g., user comments, reviews, interview responses).

Cultural Data: Information related to artistic expressions, traditions, language use, and social norms in digital spaces.

Financial Data: Digital records of monetary transactions, investments, cryptocurrencies, and economic behaviors.

Geographical Data: Location-based information including coordinates, mapping data, and spatial relationships collected through digital means.

Medical Data: Health-related information collected through digital platforms, including electronic health records and biometric data from wearable devices.

Meteorological Data: Weather and climate information gathered through digital sensors, satellites, and monitoring systems.

Transport Data: Information about movement patterns, vehicle usage, and transportation networks collected through digital tracking systems.

Scientific Data: Information gathered through digital research tools, computational models, and experimental platforms.

Statistical Data: Numerical data collections analyzed to reveal patterns and trends across digital populations.

Metadata: Data about data that provides information about the characteristics of other data, such as creation time, author, size, and format in digital files.


3.1C Uses of Data

Trend Identification: The process of discovering the direction in which values or behaviors are moving over time within digital datasets.

Pattern Recognition: The identification of recurring structures or regularities in digital data.

Connection Mapping: The process of identifying relationships between different digital entities or data points.

Relationship Analysis: Examination of how different data elements influence or correlate with each other in digital ecosystems.

Measurable Facts: Quantifiable information about digital behaviors, preferences, and characteristics that can be objectively gathered.


3.1D Data Life Cycle

Create/Collect/Extract: The initial phase where data is generated, gathered, or pulled from various digital sources.

Store: The phase where data is saved in digital repositories, databases, or storage systems for future use.

Process: The stage where raw data is cleaned, transformed, and prepared for analysis in digital systems.

Analyze: The examination of data using digital tools and techniques to discover useful information and draw conclusions.

Access: The retrieval of data through digital interfaces, queries, or applications by authorized users.

Preserve: The maintenance of data integrity and availability over time through digital archiving methods.

Reuse: The application of existing data for new purposes or in combination with other datasets across digital platforms.


3.1E Ways to Collect and Organize Data

Primary Data Collection: Gathering original data directly from digital sources for a specific purpose.

Secondary Data Collection: Using existing digital data that was originally collected for other purposes.

Database: A structured digital system for organizing, storing, and retrieving data efficiently.

Data Classification: The process of categorizing digital data based on characteristics, sensitivity, or purpose.

Data Relationships: The connections and associations between different data elements within digital structures.


3.1F Ways of Representing Data

Charts: Visual representations of data using graphical elements like lines, bars, or circles in digital formats.

Tables: Arrangements of data in rows and columns to organize and display information digitally.

Reports: Formatted presentations of data analysis with interpretations and context for digital distribution.

Infographics: Visual representations that combine data visualizations with design elements to convey complex information quickly in digital media.

Visualizations: Interactive or static graphical representations of data designed to make complex information more accessible and understandable in digital environments.


3.1G Data Security

Encryption: The process of converting data into an unreadable form (ciphertext) that can only be reversed with the correct key, preventing unauthorized access in digital systems.

Data Masking: The technique of hiding original data with modified content to protect sensitive information while maintaining functional utility.

Data Erasure: The permanent removal of digital data from storage devices or online platforms.

Blockchain: A decentralized, distributed digital ledger technology that records transactions across many computers to ensure data security and transparency.


3.1H Big Data and Data Analytics

Volume: The scale and quantity of data generated, collected, and stored in digital environments.

Variety: The diversity of data types and sources available across digital platforms.

Velocity: The speed at which digital data is generated, processed, and analyzed, often in real time.

Veracity: The accuracy, reliability, and trustworthiness of digital data and its sources.

Predictive Analysis: The use of historical and current digital data to forecast future events, behaviors, or trends.

Modeling: The creation of digital representations or simulations of real-world systems based on data.

Behavioral Understanding: The analysis of digital data to comprehend past, present, and potential future human actions and decisions.


3.1I Data Dilemmas

Data Bias: Systematic errors in digital data collection or analysis that lead to unfair advantages or disadvantages for certain groups.

Data Reliability: The consistency and dependability of digital data to accurately represent what it claims to measure.

Data Integrity: The maintenance of data accuracy, completeness, and consistency throughout its life cycle in digital systems.

Data Control: The ability to determine how digital data is used, accessed, and shared across platforms.

Data Ownership: The legal rights and responsibilities associated with possessing and using digital data.

Data Access: The ability to retrieve and use digital data based on permissions and technical capabilities.

Data Privacy: The protection of digital information from unauthorized access and the right of individuals to control their personal data.

Anonymity: The state of being unknown or unidentifiable in digital environments.

Surveillance: The monitoring of digital behaviors, communications, or activities, often through automated systems.

Personally Identifiable Information (PII): Data that can be used to identify, contact, or locate a specific individual in digital contexts.



Key Terms in Digital Society: Data Concepts and Management

3.1A Data as distinct from information, knowledge and wisdom

Definition: The DIKW (Data, Information, Knowledge, Wisdom) pyramid represents how digital technology transforms raw data into actionable wisdom through increasing context, meaning, and application in digital societies.


Examples:

  • Data: Raw digital signals collected by sensors (e.g., GPS coordinates "37.7749, -122.4194")

  • Information: Data interpreted with context (e.g., "You are in San Francisco, California")

  • Knowledge: Patterns recognized from information (e.g., "Based on your location history, you typically visit San Francisco on weekends")

  • Wisdom: Insights applied for decision-making (e.g., "Considering your travel patterns and preferences, here are personalized recommendations for your weekend in San Francisco")
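
The first three levels of the pyramid can be traced in a short Python sketch that follows the San Francisco example. The coordinates, dates, and bounding boxes are invented for illustration, and `to_city` is a hypothetical helper rather than a real geocoding service. Wisdom is left as a comment because it depends on judgment and ethics rather than calculation.

```python
from collections import Counter
from datetime import date

# DATA: raw values with no context (latitude, longitude, date)
raw_points = [
    (37.7749, -122.4194, date(2024, 3, 2)),
    (37.7749, -122.4194, date(2024, 3, 3)),
    (40.7128, -74.0060, date(2024, 3, 6)),
    (37.7749, -122.4194, date(2024, 3, 9)),
]

# Hypothetical lookup table: a rough bounding box per city
# (lat_min, lat_max, lon_min, lon_max)
CITY_BOXES = {
    "San Francisco": (37.70, 37.83, -122.52, -122.35),
    "New York": (40.49, 40.92, -74.26, -73.70),
}


def to_city(lat, lon):
    """INFORMATION: give raw coordinates a meaningful label."""
    for city, (lat_min, lat_max, lon_min, lon_max) in CITY_BOXES.items():
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
            return city
    return "Unknown"


visits = [(to_city(lat, lon), day) for lat, lon, day in raw_points]

# KNOWLEDGE: a pattern recognized across many pieces of information
# (weekday() returns 5 for Saturday and 6 for Sunday)
weekend_visits = Counter(city for city, day in visits if day.weekday() >= 5)
usual_weekend_city = weekend_visits.most_common(1)[0][0]

# WISDOM: deciding whether and how to act on this pattern, for example
# whether sending location-based recommendations respects the user's privacy.
```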


3.1B Types of data in digital contexts

Definition: Categories of data that are collected, processed, and utilized in digital environments and platforms.


Examples:

  • (a) Quantitative and qualitative

    • Quantitative: Website engagement metrics, social media follower counts, streaming service viewing times

    • Qualitative: User reviews, social media comments, interview transcripts from digital focus groups

  • (b) Cultural, financial, geographical, medical, meteorological, transport, scientific, statistical

    • Cultural: Spotify listening patterns, Netflix viewing preferences, digital art metadata

    • Financial: Cryptocurrency transactions, mobile payment records, algorithmic trading data

    • Geographical: Location data from smartphones, digital mapping information, geotargeted advertising data

    • Medical: Electronic health records, wearable device biometric data, telemedicine consultation logs

    • Meteorological: Digital weather station readings, satellite imagery of climate patterns

    • Transport: Ride-sharing app data, public transit smart card usage, autonomous vehicle sensor feeds

    • Scientific: Digital telescope readings, computational biology simulations, digital genomic sequences

    • Statistical: Online survey results, digital census responses, web analytics metrics

  • (c) Metadata

    • Social media post timestamps, engagement metrics, and audience demographics

    • Digital photo EXIF data revealing device information, capture time, location, and editing software

    • Browser cookies tracking website usage patterns, preferences, and user journeys
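
To see the difference between data and metadata, the Python sketch below writes a small file and then asks the operating system what it knows about that file. The file content is invented, and the three fields shown are only a sample of what a file system records. Photo EXIF data works on the same principle but is stored inside the image file itself.

```python
import os
import tempfile
from datetime import datetime, timezone

content = b"Hello from the beach!"  # the data itself

# Write the data to a temporary file
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as handle:
    handle.write(content)

# Metadata: facts about the file, not the message inside it
info = os.stat(path)
metadata = {
    "file_name": os.path.basename(path),
    "size_bytes": info.st_size,
    "last_modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
}

os.remove(path)  # clean up the temporary file
```

Even without opening the file, the metadata reveals when it was last changed and how large it is, which is why metadata alone can raise privacy concerns.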


3.1C Uses of data in digital society

Definition: Applications of digital data to understand and influence social, economic, and political aspects of digitally connected communities.


Examples:

  • (a) Identify trends, patterns, connections and relationships

    • Social media sentiment analysis to gauge public opinion on political issues

    • Spotting emerging viral content patterns across digital platforms

    • Mapping connections between digital influencers and their audience demographics

    • Detecting correlations between online behavior and consumer purchasing decisions

  • (b) Collect and organize measurable facts about people and communities

    • Digital footprints across social media platforms revealing lifestyle preferences

    • Smart city sensors monitoring traffic patterns, air quality, and energy usage

    • Digital divide metrics showing disparities in internet access across different communities

    • Online educational platform analytics tracking learning outcomes and engagement


3.1D Data life cycle in digital environments

Definition: The journey of digital data from creation through various stages of processing, use, and eventual archiving or deletion.


Examples:

  • Create/collect/extract: Social media API harvesting, IoT sensor networks, online form submissions, web scraping

  • Store: Cloud-based databases, distributed storage systems, blockchain ledgers, digital archives

  • Process: Algorithmic data cleaning, machine learning preprocessing, digital signal processing

  • Analyze: AI-powered pattern recognition, big data analytics platforms, network analysis of digital relationships

  • Access: Mobile app interfaces, data visualization dashboards, voice-activated digital assistants

  • Preserve: Digital archiving initiatives, time-stamped blockchain records, version control systems

  • Reuse: Open data repositories, API access to public datasets, digital data marketplaces


3.1E Ways to collect and organize data in digital systems

Definition: Digital methods and technologies used to gather, structure, and maintain data across platforms and applications.


Examples:

  • (a) Primary and secondary data collection

    • Primary: Mobile app usage tracking, online surveys, digital ethnography, social media monitoring

    • Secondary: Open data portals, digital archives, commercial data marketplaces, academic research repositories

  • (b) Databases organize and structure collections of data

    • Cloud-based relational databases powering e-commerce platforms

    • Graph databases mapping social networks and digital relationships

    • Time-series databases tracking IoT sensor data from smart homes

    • Distributed ledgers maintaining cryptocurrency transaction history

  • (c) Data classifications and relationships

    • Digital content taxonomies for streaming platforms' recommendation systems

    • User-generated tagging systems on social media platforms

    • Semantic web ontologies enabling machine-readable relationships between digital concepts

    • Digital identity relationships between accounts, devices, and online behaviors
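
A relational database makes "classification" and "relationships" concrete: categories become columns, and relationships become keys that link tables. The Python sketch below uses the built-in `sqlite3` module with an in-memory database. The table names, usernames, and like counts are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL
    );
    CREATE TABLE posts (
        post_id  INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES users(user_id),  -- relationship
        category TEXT NOT NULL,                               -- classification
        likes    INTEGER NOT NULL
    );
    """
)
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ana"), (2, "ben")])
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [(1, 1, "music", 120), (2, 1, "travel", 45), (3, 2, "music", 80)],
)

# Follow the relationship between the two tables to total likes per user
likes_per_user = conn.execute(
    """
    SELECT users.username, SUM(posts.likes)
    FROM posts JOIN users ON posts.user_id = users.user_id
    GROUP BY users.username
    ORDER BY users.username
    """
).fetchall()

# Use the classification column to count posts per category
posts_per_category = dict(
    conn.execute("SELECT category, COUNT(*) FROM posts GROUP BY category").fetchall()
)
conn.close()
```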


3.1F Ways of representing data in digital formats

Definition: Digital visualization and presentation methods that transform complex datasets into comprehensible formats for human understanding.


Examples:

  • Charts: Interactive COVID-19 case tracking dashboards, real-time stock market visualizations

  • Tables: Dynamic spreadsheets of digital advertising campaign performance, sortable online leaderboards

  • Reports: Automated digital analytics summaries, AI-generated business intelligence briefings

  • Infographics: Interactive visualizations of global internet usage, animated climate change data stories

  • Visualizations: Virtual reality data environments, augmented reality overlays of urban information, interactive network graphs of social connections


3.1G Data security in digital environments

Definition: Technological measures implemented to protect digital data from unauthorized access, corruption, or theft in interconnected systems.


Examples:

  • (a) Encryption, data masking, data erasure

    • Encryption: End-to-end encrypted messaging apps, VPN services, encrypted cloud storage

    • Data masking: Social security number obfuscation in digital forms, tokenization of payment information

    • Data erasure: GDPR-compliant account deletion processes, secure digital device wiping services

  • (b) Blockchain

    • NFT (non-fungible token) verification of digital art ownership

    • Decentralized autonomous organizations (DAOs) governing online communities

    • Self-sovereign identity systems giving users control over their digital credentials

    • Transparent supply chain tracking for ethical product sourcing verification


3.1H Characteristics and uses of big data and data analytics in digital society

Definition: Properties and applications of massive digital datasets that reveal patterns about human behavior, social trends, and digital interactions.


Examples:

  • (a) Characteristics: volume, variety, velocity, veracity

    • Volume: Exabytes of user-generated content across social media platforms

    • Variety: Multimedia digital footprints combining text, images, video, location, and relational data

    • Velocity: Real-time processing of continuous, high-speed streams of global internet transactions

    • Veracity: Authentication systems for digital content to combat deepfakes and misinformation

  • (b) Uses: Predictive analysis, modelling, understanding behavior

    • Predictive analysis: Algorithm-based content recommendation systems anticipating user preferences

    • Modelling: Digital twins of urban environments simulating traffic, pollution, and resource usage

    • Understanding past behavior: Analyzing historical trends in online political discourse

    • Understanding current behavior: Real-time monitoring of global pandemic information seeking

    • Understanding future behavior: Predicting viral content spread through social network simulation
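
Predictive analysis, at its simplest, means fitting a pattern to past data and extending it forward. The Python sketch below fits a straight line by least squares to four days of invented view counts and forecasts the fifth day. Real recommendation and forecasting systems use far richer models; the function name `fit_line` and the numbers are assumptions made for this example.

```python
def fit_line(ys):
    """Least-squares straight line through (0, ys[0]), (1, ys[1]), ..."""
    xs = range(len(ys))
    mean_x = sum(xs) / len(ys)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


daily_views = [100, 120, 140, 160]  # invented historical data for days 0 to 3
slope, intercept = fit_line(daily_views)

# Extend the trend one step into the future
forecast_day_4 = slope * 4 + intercept
```

The forecast is only as good as the assumption that the past trend continues, which links directly to the veracity and reliability issues covered elsewhere in this guide.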


3.1I Data dilemmas in digital society

Definition: Ethical, social, and political challenges arising from pervasive data collection and use in digitally connected communities.


Examples:

  • (a) Data bias, reliability and integrity

    • Bias: Algorithmic discrimination in automated hiring systems or facial recognition technologies

    • Reliability: False information spreading through social media recommendation algorithms

    • Integrity: Deepfake videos manipulating digital evidence or distorting public discourse

  • (b) Control, ownership and access to data

    • Control: Platform governance decisions about content moderation and algorithmic transparency

    • Ownership: Digital rights management systems restricting content sharing versus open access movements

    • Access: Digital divides creating inequalities in who can benefit from data-driven services

  • (c) Data privacy, anonymity and surveillance

    • Privacy: Pervasive tracking across digital platforms creating detailed behavioral profiles

    • Anonymity: Blockchain-based services enabling pseudonymous participation in digital economies

    • Surveillance: Facial recognition in public spaces, digital monitoring of worker productivity

    • Personally identifiable information: Biometric data from wearable devices, cross-platform identity linking, persistent digital identifiers like device IDs
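
One simple way to look for data bias in an automated decision system is to compare outcomes across groups. The Python sketch below uses invented decisions from a hypothetical screening tool and computes each group's selection rate and the ratio between the lowest and highest rate. A low ratio is a signal to investigate further, not proof of discrimination on its own.

```python
from collections import defaultdict

# Invented decisions from a hypothetical automated screening tool:
# (group label, whether the applicant was selected)
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

totals = defaultdict(int)
selected = defaultdict(int)
for group, was_selected in decisions:
    totals[group] += 1
    selected[group] += int(was_selected)

# Share of each group that was selected
selection_rates = {group: selected[group] / totals[group] for group in totals}

# Ratio of the lowest selection rate to the highest (1.0 means equal rates)
impact_ratio = min(selection_rates.values()) / max(selection_rates.values())
```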


Examination Questions for IB Digital Society: Data Concepts

3.1A Data as distinct from information, knowledge and wisdom

  1. Define the DIKW pyramid as it relates to digital society.

  2. Describe how raw data transforms into wisdom in the context of a social media platform.

  3. Outline the key differences between data and information in digital environments.

  4. State two examples of how digital technology converts data into knowledge.


3.1B Types of data in digital contexts

  1. Identify three types of qualitative data commonly collected through digital platforms.

  2. List four examples of metadata generated by smartphone applications.

  3. Outline the difference between quantitative and qualitative data in digital contexts.

  4. Describe how cultural data is collected and used in streaming services.

  5. Outline the various types of geographical data utilized in modern navigation applications.


3.1C Uses of data in digital society

  1. State two ways social media platforms identify trends from user data.

  2. Describe how e-commerce websites use data to identify relationships between products.

  3. Outline the methods used to collect measurable facts about digital communities.

  4. Describe how pattern recognition in digital data influences business decision-making.


3.1D Data life cycle in digital environments

  1. List the seven stages of the data life cycle in digital systems.

  2. Describe the process of data preservation in blockchain technologies.

  3. Outline how the collection and storage stages differ in mobile apps versus IoT devices.

  4. State two challenges associated with the reuse phase of the data life cycle.


3.1E Ways to collect and organize data in digital systems

  1. Define primary and secondary data collection in the context of digital research.

  2. Identify three characteristics of effective database organization in digital platforms.

  3. Describe how cloud-based databases structure and manage user information.

  4. Outline the different data classification systems used in digital content management.


3.1F Ways of representing data in digital formats

  1. List four digital formats used to represent complex datasets visually.

  2. State the advantages of interactive visualizations over static reports in digital contexts.

  3. Describe how infographics are used to communicate digital society trends.

  4. Identify two examples of data visualization techniques specific to social network analysis.


3.1G Data security in digital environments

  1. Define encryption as it applies to digital communication.

  2. Outline three methods of data masking used to protect personally identifiable information online.

  3. Describe how blockchain technology ensures data security in digital transactions.

  4. State two approaches to secure data erasure on digital devices.


3.1H Characteristics and uses of big data and data analytics in digital society

  1. List the four main characteristics of big data (the four Vs).

  2. Describe how velocity affects real-time data processing in social media platforms.

  3. Outline how predictive analysis is applied in digital content recommendation systems.

  4. Define digital twins and explain their role in modeling complex systems.

  5. Identify three examples of how big data analytics helps understand current human behavior online.


3.1I Data dilemmas in digital society

  1. State two examples of algorithmic bias in digital technologies.

  2. Describe the ethical concerns surrounding data ownership in social media platforms.

  3. Define personally identifiable information in the context of digital privacy.

  4. Outline the tension between digital surveillance and personal privacy in smart cities.

  5. List three challenges to maintaining data integrity in user-generated content platforms.








2024 IBDP DIGITAL SOCIETY | LUKE WATSON TEACH
