Bitcoin

Base Tables

The Bitcoin dataset's core structure is based on four primary tables: blocks, transactions, and inputs/outputs. Each table captures essential details that collectively represent the Bitcoin blockchain's transactional data, enabling deep insights into block-level, transaction-level, and input-output dynamics. The blocks table records metadata for each mined block, such as block height, hash, mining difficulty, and reward details, creating a foundation for chronological and structural analysis of the blockchain. The transactions table logs every individual transaction within these blocks, including transaction size, fees, and input/output counts, which facilitates the tracking of fund movements and transactional patterns within each block.

The inputs and outputs tables expand on the granular details of transactions, offering insight into each transaction's source and destination of funds. The inputs table stores data about funds being consumed, including addresses, value, and unlocking scripts, while the outputs table logs the destination addresses, output value, and locking scripts. Together, these tables provide a complete view of how value is transferred across the Bitcoin network, supporting analytical queries that range from tracing individual transactions to identifying complex patterns in fund distribution, fee structures, and network activity. This setup enables comprehensive blockchain data exploration for both high-level trends and low-level transaction details.

Blocks

The blocks table stores detailed information about each block in the Bitcoin blockchain. It includes metadata such as block height, timestamp, transaction details, and mining difficulty.

Schema Overview

Column
Data Type
Description

height

INT

Primary key. The block height represents the number of blocks preceding this block in the blockchain.

bits

VARCHAR(10)

Encodes the target threshold for the block's cryptographic hash, adjusting the network's mining difficulty.

chainwork

VARCHAR(64)

A cumulative measure of the proof-of-work effort that has gone into the chain up to this block.

coinbase

VARCHAR(200)

Information about the coinbase transaction, which rewards miners.

date

DATE

The date the block was mined.

difficulty

DECIMAL(38, 0)

The difficulty target for mining this block, indicating how hard it is to generate a valid hash.

hash

VARCHAR(64)

The unique identifier for this block, generated from the block's header.

merkle_root

VARCHAR(64)

The root hash of the Merkle tree of transactions in this block, ensuring data integrity.

mint_reward

DOUBLE

The reward for successfully mining this block, typically in Bitcoin.

nonce

VARCHAR(10)

A variable incremented by miners to find a valid hash under the target threshold.

previous_block_hash

VARCHAR(64)

The hash of the preceding block in the chain, linking blocks together.

size

INT

The size of the block in bytes.

stripped_size

INT

The block size without witness data, as used in SegWit.

time

DATETIME

The exact timestamp when this block was mined.

total_fees

DOUBLE

The total fees from transactions in this block, collected by the miner.

total_reward

DOUBLE

The total reward, combining the mint reward and transaction fees.

transaction_count

INT

The number of transactions included in this block.

weight

INT

The block weight, which is used in SegWit to ensure block size limits.

Transactions

The transactions table provides comprehensive data on individual transactions recorded on the Bitcoin blockchain. Each transaction is linked to a block by its height and hash, with additional fields capturing details about inputs, outputs, fees, and metadata.

Schema Overview

Column
Data Type
Description

id

VARCHAR(64)

Primary key. Unique identifier for the transaction, known as the transaction hash.

block_height

INT

Height of the block containing this transaction, linking it to the blocks table.

block_date

DATE

Date when the block was mined.

size

INT

Size of the transaction in bytes.

virtual_size

INT

Virtual size in vbytes, used to calculate fees in SegWit-enabled blocks.

version

SMALLINT

Version of the transaction, identifying the Bitcoin protocol rules followed.

lock_time

INT

Time or block height at which this transaction becomes valid for inclusion in a block.

block_hash

VARCHAR(64)

Hash of the block containing this transaction.

block_time

DATETIME

Exact timestamp when the block containing this transaction was mined.

index

BIGINT

The index position of the transaction within the block.

input_count

SMALLINT

Number of input records in this transaction.

output_count

SMALLINT

Number of output records in this transaction.

input_value

BIGINT

Total value of inputs in satoshis.

output_value

BIGINT

Total value of outputs in satoshis.

is_coinbase

BOOLEAN

Indicates if this is a coinbase transaction (i.e., the first transaction in a block rewarding miners).

coinbase

TEXT

Details about the coinbase transaction, including reward.

fee

BIGINT

Total transaction fee in satoshis, calculated as the difference between input_value and output_value.

hex

TEXT

Raw HEX data for the transaction, representing the entire transaction as a hexadecimal string.

inputs

JSON

JSON array of input details, capturing previous outputs spent in this transaction.

outputs

JSON

JSON array of output details, showing where the transaction funds were sent.

Indexing and Query Optimization

  • Indexes: The table includes bitmap indexes on block_time, block_date, and is_coinbase to optimize queries by transaction date, mining status, and block timestamp.

  • Engine: OLAP, designed for high-performance analytical queries.

  • Partitioning: By block_height range, enabling efficient retrieval of transactions over specific block ranges.

Inputs

The inputs table stores information about each input in Bitcoin transactions. Inputs represent the source of funds spent in a transaction, linking to previous transaction outputs.

Schema Overview

Column
Data Type
Description

transaction_id

VARCHAR(64)

Primary key. Hash of the transaction containing this input.

block_height

INT

Height of the block containing this transaction, linking it to the blocks table.

block_date

DATE

Date when the block was mined.

index

INT

Index of this input within the transaction, uniquely identifying it.

address

VARCHAR(64)

Bitcoin address associated with this input, if available.

required_signatures

INT

Number of signatures required to authorize spending of this input.

script_asm

TEXT

The input's script in human-readable assembly language, defining the unlocking conditions.

script_hex

TEXT

The input script in raw hexadecimal format, enabling parsing of the unlocking script.

sequence

BIGINT

Sequence number used for ordering inputs or setting locktime behavior.

spent_output_index

INT

The index of the output being spent in the referenced transaction.

spent_transaction_hash

VARCHAR(64)

Hash of the transaction from which this input derives its funds.

type

VARCHAR(40)

Type of input, e.g., P2PKH, P2SH, identifying the script type.

value

BIGINT

Amount of Bitcoin in satoshis provided by this input.

Outputs

The outputs table captures information about each output in Bitcoin transactions. Outputs represent the destination of funds within a transaction, detailing the recipient address, value, and conditions for spending.

Schema Overview

Column
Data Type
Description

transaction_id

VARCHAR(64)

Primary key. Hash of the transaction containing this output.

block_height

INT

Height of the block containing this transaction, linking it to the blocks table.

block_date

DATE

Date when the block was mined.

index

INT

Index of this output within the transaction, uniquely identifying it within the transaction.

address

VARCHAR(64)

Bitcoin address associated with this output, if applicable.

value

BIGINT

Value of Bitcoin in satoshis represented by this output.

script_hex

TEXT

The output script (ScriptPubKey) in hexadecimal format, defining the conditions for spending this output.

script_asm

TEXT

The output script in assembly language, providing a human-readable version of the ScriptPubKey.

type

VARCHAR(40)

Type of output script, such as P2PKH, P2SH, identifying the type of locking conditions.

required_signatures

INT

Number of signatures required to unlock this output for spending.

Data Audit

Blocks vs Transactions Count

View: audit_blocks_tx_count

Description: This view provides an audit report of discrepancies in the transaction counts recorded within the Bitcoin blocks. Specifically, it identifies blocks where the declared transaction_count in the blocks table does not match the actual count of transactions recorded in the transactions table. This audit view is essential for validating data integrity, ensuring consistency between block metadata and actual transactional records.

Schema Overview

Column
Data Type
Description

block_height

INT

The height of the block with a potential discrepancy in transaction count.

declared_transaction_count

INT

The transaction count as declared in the blocks table for this block height.

actual_transaction_count

INT

The actual count of transactions found in the transactions table for this block height.

Missing Blocks

View: audit_missing_blocks

Description: This view identifies and lists gaps in block heights within the Bitcoin blockchain, highlighting missing blocks in the sequence. By reporting these gaps, the view enables quick identification of potentially missing or unrecorded blocks, which can be crucial for data consistency and completeness checks.

Schema Overview

Column
Data Type
Description

gap_start

INT

The height at which the missing block sequence begins.

gap_end

INT

The height at which the missing block sequence ends.

gap_size

INT

The number of missing blocks in this gap, calculated as the difference between start and end.

L2

Addresses

  • Daily Active Addresses

Last updated