Bitcoin
Base Tables
The Bitcoin dataset's core structure is based on four primary tables: blocks
, transactions
, and inputs
/outputs
. Each table captures essential details that collectively represent the Bitcoin blockchain's transactional data, enabling deep insights into block-level, transaction-level, and input-output dynamics. The blocks
table records metadata for each mined block, such as block height, hash, mining difficulty, and reward details, creating a foundation for chronological and structural analysis of the blockchain. The transactions
table logs every individual transaction within these blocks, including transaction size, fees, and input/output counts, which facilitates the tracking of fund movements and transactional patterns within each block.
The inputs
and outputs
tables expand on the granular details of transactions, offering insight into each transaction's source and destination of funds. The inputs
table stores data about funds being consumed, including addresses, value, and unlocking scripts, while the outputs
table logs the destination addresses, output value, and locking scripts. Together, these tables provide a complete view of how value is transferred across the Bitcoin network, supporting analytical queries that range from tracing individual transactions to identifying complex patterns in fund distribution, fee structures, and network activity. This setup enables comprehensive blockchain data exploration for both high-level trends and low-level transaction details.
Blocks
The blocks
table stores detailed information about each block in the Bitcoin blockchain. It includes metadata such as block height, timestamp, transaction details, and mining difficulty.
Schema Overview
height
INT
Primary key. The block height represents the number of blocks preceding this block in the blockchain.
bits
VARCHAR(10)
Encodes the target threshold for the block's cryptographic hash, adjusting the network's mining difficulty.
chainwork
VARCHAR(64)
A cumulative measure of the proof-of-work effort that has gone into the chain up to this block.
coinbase
VARCHAR(200)
Information about the coinbase transaction, which rewards miners.
date
DATE
The date the block was mined.
difficulty
DECIMAL(38, 0)
The difficulty target for mining this block, indicating how hard it is to generate a valid hash.
hash
VARCHAR(64)
The unique identifier for this block, generated from the block's header.
merkle_root
VARCHAR(64)
The root hash of the Merkle tree of transactions in this block, ensuring data integrity.
mint_reward
DOUBLE
The reward for successfully mining this block, typically in Bitcoin.
nonce
VARCHAR(10)
A variable incremented by miners to find a valid hash under the target threshold.
previous_block_hash
VARCHAR(64)
The hash of the preceding block in the chain, linking blocks together.
size
INT
The size of the block in bytes.
stripped_size
INT
The block size without witness data, as used in SegWit.
time
DATETIME
The exact timestamp when this block was mined.
total_fees
DOUBLE
The total fees from transactions in this block, collected by the miner.
total_reward
DOUBLE
The total reward, combining the mint reward and transaction fees.
transaction_count
INT
The number of transactions included in this block.
weight
INT
The block weight, which is used in SegWit to ensure block size limits.
Transactions
The transactions
table provides comprehensive data on individual transactions recorded on the Bitcoin blockchain. Each transaction is linked to a block by its height and hash, with additional fields capturing details about inputs, outputs, fees, and metadata.
Schema Overview
id
VARCHAR(64)
Primary key. Unique identifier for the transaction, known as the transaction hash.
block_height
INT
Height of the block containing this transaction, linking it to the blocks
table.
block_date
DATE
Date when the block was mined.
size
INT
Size of the transaction in bytes.
virtual_size
INT
Virtual size in vbytes, used to calculate fees in SegWit-enabled blocks.
version
SMALLINT
Version of the transaction, identifying the Bitcoin protocol rules followed.
lock_time
INT
Time or block height at which this transaction becomes valid for inclusion in a block.
block_hash
VARCHAR(64)
Hash of the block containing this transaction.
block_time
DATETIME
Exact timestamp when the block containing this transaction was mined.
index
BIGINT
The index position of the transaction within the block.
input_count
SMALLINT
Number of input records in this transaction.
output_count
SMALLINT
Number of output records in this transaction.
input_value
BIGINT
Total value of inputs in satoshis.
output_value
BIGINT
Total value of outputs in satoshis.
is_coinbase
BOOLEAN
Indicates if this is a coinbase transaction (i.e., the first transaction in a block rewarding miners).
coinbase
TEXT
Details about the coinbase transaction, including reward.
fee
BIGINT
Total transaction fee in satoshis, calculated as the difference between input_value
and output_value
.
hex
TEXT
Raw HEX data for the transaction, representing the entire transaction as a hexadecimal string.
inputs
JSON
JSON array of input details, capturing previous outputs spent in this transaction.
outputs
JSON
JSON array of output details, showing where the transaction funds were sent.
Indexing and Query Optimization
Indexes: The table includes bitmap indexes on
block_time
,block_date
, andis_coinbase
to optimize queries by transaction date, mining status, and block timestamp.Engine: OLAP, designed for high-performance analytical queries.
Partitioning: By
block_height
range, enabling efficient retrieval of transactions over specific block ranges.
Inputs
The inputs
table stores information about each input in Bitcoin transactions. Inputs represent the source of funds spent in a transaction, linking to previous transaction outputs.
Schema Overview
transaction_id
VARCHAR(64)
Primary key. Hash of the transaction containing this input.
block_height
INT
Height of the block containing this transaction, linking it to the blocks
table.
block_date
DATE
Date when the block was mined.
index
INT
Index of this input within the transaction, uniquely identifying it.
address
VARCHAR(64)
Bitcoin address associated with this input, if available.
required_signatures
INT
Number of signatures required to authorize spending of this input.
script_asm
TEXT
The input's script in human-readable assembly language, defining the unlocking conditions.
script_hex
TEXT
The input script in raw hexadecimal format, enabling parsing of the unlocking script.
sequence
BIGINT
Sequence number used for ordering inputs or setting locktime behavior.
spent_output_index
INT
The index of the output being spent in the referenced transaction.
spent_transaction_hash
VARCHAR(64)
Hash of the transaction from which this input derives its funds.
type
VARCHAR(40)
Type of input, e.g., P2PKH
, P2SH
, identifying the script type.
value
BIGINT
Amount of Bitcoin in satoshis provided by this input.
Outputs
The outputs
table captures information about each output in Bitcoin transactions. Outputs represent the destination of funds within a transaction, detailing the recipient address, value, and conditions for spending.
Schema Overview
transaction_id
VARCHAR(64)
Primary key. Hash of the transaction containing this output.
block_height
INT
Height of the block containing this transaction, linking it to the blocks
table.
block_date
DATE
Date when the block was mined.
index
INT
Index of this output within the transaction, uniquely identifying it within the transaction.
address
VARCHAR(64)
Bitcoin address associated with this output, if applicable.
value
BIGINT
Value of Bitcoin in satoshis represented by this output.
script_hex
TEXT
The output script (ScriptPubKey) in hexadecimal format, defining the conditions for spending this output.
script_asm
TEXT
The output script in assembly language, providing a human-readable version of the ScriptPubKey.
type
VARCHAR(40)
Type of output script, such as P2PKH
, P2SH
, identifying the type of locking conditions.
required_signatures
INT
Number of signatures required to unlock this output for spending.
Data Audit
Blocks vs Transactions Count
View: audit_blocks_tx_count
audit_blocks_tx_count
Description: This view provides an audit report of discrepancies in the transaction counts recorded within the Bitcoin blocks. Specifically, it identifies blocks where the declared transaction_count
in the blocks
table does not match the actual count of transactions recorded in the transactions
table. This audit view is essential for validating data integrity, ensuring consistency between block metadata and actual transactional records.
Schema Overview
block_height
INT
The height of the block with a potential discrepancy in transaction count.
declared_transaction_count
INT
The transaction count as declared in the blocks
table for this block height.
actual_transaction_count
INT
The actual count of transactions found in the transactions
table for this block height.
Missing Blocks
View: audit_missing_blocks
audit_missing_blocks
Description: This view identifies and lists gaps in block heights within the Bitcoin blockchain, highlighting missing blocks in the sequence. By reporting these gaps, the view enables quick identification of potentially missing or unrecorded blocks, which can be crucial for data consistency and completeness checks.
Schema Overview
gap_start
INT
The height at which the missing block sequence begins.
gap_end
INT
The height at which the missing block sequence ends.
gap_size
INT
The number of missing blocks in this gap, calculated as the difference between start and end.
L2
Addresses
Daily Active Addresses
Last updated