[[Real_time_Reverse_Mapping_Btree]]
=== Reverse-Mapping B+tree
If the reverse-mapping B+tree and real-time storage device features
are enabled, each real-time group has its own reverse block-mapping
B+tree.

As mentioned in the chapter about xref:Reconstruction[reconstruction],
this data structure is another piece of the puzzle necessary to
reconstruct the data or attribute fork of a file from reverse-mapping
records. We can also use it to double-check allocations to ensure that
we are not accidentally cross-linking blocks, which can cause severe
damage to the filesystem.

This B+tree is only present if the +XFS_SB_FEAT_RO_COMPAT_RMAPBT+
feature is enabled and a real-time device is present. The feature
requires a version 5 filesystem.

The rtgroup reverse-mapping B+tree is rooted in an inode's data fork; the inode
number can be found by resolving the path +/rtgroups/$rgno.rmap+ in the
metadata directory tree. The B+tree blocks themselves are stored on the data
volume. The structure used for an inode's B+tree root is:
[source, c]
----
struct xfs_rtrmap_root {
	__be16				bb_level;
	__be16				bb_numrecs;
};
----
* If the B+tree contains only a single level, the ondisk data fork area begins
with a +xfs_rtrmap_root+ header followed by an array of +xfs_rmap_rec+ leaf
records.
* Otherwise, the ondisk data fork area begins with the +xfs_rtrmap_root+
header, followed first by an array of doubled-up +xfs_rmap_key+ values (a low
key and a high key for each pointer) and then by an array of
+xfs_rtrmap_ptr_t+ values. The number of entries in both arrays is
specified by the header's +bb_numrecs+ value.
* The root node in the inode can only contain up to 14 leaf records or 7
key/pointer pairs for a standard 512-byte inode before a new level of nodes is
added between the root and the leaves.
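
The layout rules above can be sketched in C. This is only an illustration, not
the kernel's implementation: the helper names are hypothetical, the structure
sizes assume packed ondisk layouts with no padding, and the 8-byte size of
+xfs_rtrmap_ptr_t+ is an assumption, since this section does not state it. The
usable size of the data fork area is passed in as a parameter because it is
not given here.

[source, c]
----
#include <stddef.h>
#include <stdint.h>

/* Ondisk sizes in bytes; assumed packed, with no padding. */
#define RTRMAP_ROOT_HDR_BYTES	4u	/* bb_level + bb_numrecs */
#define RMAP_REC_BYTES		24u	/* 4 + 4 + 8 + 8 */
#define RMAP_KEY_BYTES		20u	/* 4 + 8 + 8 */
#define RTRMAP_PTR_BYTES	8u	/* assumption: 64-bit block pointer */

/* Hypothetical in-memory copy of the root header. */
struct rtrmap_root_hdr {
	uint16_t	level;
	uint16_t	numrecs;
};

/* Decode the two big-endian 16-bit fields at the start of the fork area. */
static inline struct rtrmap_root_hdr rtrmap_root_decode(const uint8_t *fork)
{
	struct rtrmap_root_hdr hdr;

	hdr.level = (uint16_t)((fork[0] << 8) | fork[1]);
	hdr.numrecs = (uint16_t)((fork[2] << 8) | fork[3]);
	return hdr;
}

/* Number of leaf records that fit after the header in a single-level root. */
static inline size_t rtrmap_root_max_leaf_recs(size_t fork_bytes)
{
	if (fork_bytes < RTRMAP_ROOT_HDR_BYTES)
		return 0;
	return (fork_bytes - RTRMAP_ROOT_HDR_BYTES) / RMAP_REC_BYTES;
}

/* Number of (low key, high key, pointer) entries that fit in a node root. */
static inline size_t rtrmap_root_max_node_recs(size_t fork_bytes)
{
	if (fork_bytes < RTRMAP_ROOT_HDR_BYTES)
		return 0;
	return (fork_bytes - RTRMAP_ROOT_HDR_BYTES) /
	       (2 * RMAP_KEY_BYTES + RTRMAP_PTR_BYTES);
}
----

A node entry costs twice as much space as a leaf record here (48 bytes against
24), which is why the root holds about half as many key/pointer pairs as leaf
records.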
Each record in an rtgroup reverse-mapping B+tree has the same structure as a
record in an AG reverse-mapping B+tree:
[source, c]
----
struct xfs_rmap_rec {
	__be32				rm_startblock;
	__be32				rm_blockcount;
	__be64				rm_owner;
	__be64				rm_fork:1;
	__be64				rm_bmbt:1;
	__be64				rm_unwritten:1;
	__be64				rm_offset:61;
};
----
*rm_startblock*::
The real-time group block number of the first block of this extent.
*rm_blockcount*::
The length of this extent, in rt blocks.
*rm_owner*::
A 64-bit number describing the owner of this extent. This must be
+XFS_RMAP_OWN_FS+ for the first extent in real-time group zero if real-time
superblocks are enabled. For all other records, it must be an inode number,
because the real-time volume does not store any other metadata.
*rm_fork*::
This value will always be zero.
*rm_bmbt*::
This value will always be zero.
*rm_unwritten*::
A flag indicating that the extent is unwritten. This corresponds to
the flag in the xref:Data_Extents[extent record] format which means
+XFS_EXT_UNWRITTEN+.
*rm_offset*::
The 61-bit logical file block offset, if +rm_owner+ describes an
inode.
[NOTE]
The single-bit flag values +rm_unwritten+, +rm_fork+, and +rm_bmbt+
are packed into the larger fields in the C structure definition.
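
To make the packing concrete, here is a minimal sketch that decodes a raw
24-byte record into host-endian fields. It assumes the three flags occupy the
top three bits of the final big-endian 64-bit word, in the order the structure
lists them (+rm_fork+ highest), with the 61-bit offset below them. The type,
macro, and function names are hypothetical and chosen for illustration only.

[source, c]
----
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions of the flags packed above the 61-bit offset. */
#define RMAP_OFF_ATTR_FORK	(UINT64_C(1) << 63)	/* rm_fork */
#define RMAP_OFF_BMBT_BLOCK	(UINT64_C(1) << 62)	/* rm_bmbt */
#define RMAP_OFF_UNWRITTEN	(UINT64_C(1) << 61)	/* rm_unwritten */
#define RMAP_OFF_MASK		((UINT64_C(1) << 61) - 1)

/* Hypothetical host-endian form of one reverse-mapping record. */
struct rmap_rec_cpu {
	uint32_t	startblock;
	uint32_t	blockcount;
	uint64_t	owner;
	uint64_t	offset;
	bool		attr_fork;
	bool		bmbt_block;
	bool		unwritten;
};

/* Read a big-endian 32-bit value from a byte buffer. */
static inline uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 64-bit value from a byte buffer. */
static inline uint64_t get_be64(const uint8_t *p)
{
	return ((uint64_t)get_be32(p) << 32) | get_be32(p + 4);
}

/* Split a raw ondisk record into its fields and single-bit flags. */
static inline struct rmap_rec_cpu rmap_rec_decode(const uint8_t raw[24])
{
	struct rmap_rec_cpu rec;
	uint64_t packed = get_be64(raw + 16);

	rec.startblock = get_be32(raw);
	rec.blockcount = get_be32(raw + 4);
	rec.owner = get_be64(raw + 8);
	rec.offset = packed & RMAP_OFF_MASK;
	rec.attr_fork = (packed & RMAP_OFF_ATTR_FORK) != 0;
	rec.bmbt_block = (packed & RMAP_OFF_BMBT_BLOCK) != 0;
	rec.unwritten = (packed & RMAP_OFF_UNWRITTEN) != 0;
	return rec;
}
----

For a record in this B+tree, a decoder like this would be expected to report
+attr_fork+ and +bmbt_block+ as false, since both fields are always zero.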
The key has the following structure:
[source, c]
----
struct xfs_rmap_key {
	__be32				rm_startblock;
	__be64				rm_owner;
	__be64				rm_fork:1;
	__be64				rm_bmbt:1;
	__be64				rm_reserved:1;
	__be64				rm_offset:61;
};
----
* All block numbers in records and keys are 32-bit real-time group block
numbers.
* The +bb_magic+ value is ``MAPR'' (0x4d415052).
* The +struct xfs_btree_lblock+ header is used for intermediate B+tree nodes as
well as the leaves.
* Each pointer is associated with two keys. The first of these is the
"low key", which is the key of the smallest record accessible through
the pointer. This low key has the same meaning as the key in all
other btrees. The second key is the "high key", which is the largest
key that can be used to access any record underneath
the pointer. Recall that each record in the rtgroup reverse-mapping
B+tree describes an interval of physical blocks mapped to an interval
of logical file block offsets; therefore, it makes sense that a range
of keys can be used to find a record.
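
The relationship between a record and its two keys can be sketched as follows.
This is a simplified illustration under stated assumptions, not the kernel's
code: the names are hypothetical, keys are assumed to compare by start block,
then owner, then offset, and the offset is assumed to advance only for inode
owners (non-inode owners such as +XFS_RMAP_OWN_FS+ are treated as values with
the top bit set).

[source, c]
----
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-endian record and key, flags omitted for brevity. */
struct rmap_irec {
	uint32_t	startblock;
	uint32_t	blockcount;
	uint64_t	owner;
	uint64_t	offset;
};

struct rmap_ikey {
	uint32_t	startblock;
	uint64_t	owner;
	uint64_t	offset;
};

/* The low key describes the first block of the record's interval. */
static inline struct rmap_ikey rmap_low_key(const struct rmap_irec *rec)
{
	struct rmap_ikey key = { rec->startblock, rec->owner, rec->offset };

	return key;
}

/* The high key describes the last block of the record's interval. */
static inline struct rmap_ikey rmap_high_key(const struct rmap_irec *rec)
{
	struct rmap_ikey key = rmap_low_key(rec);
	uint32_t adj = rec->blockcount - 1;

	key.startblock += adj;
	/* Assumption: only inode owners carry a meaningful file offset. */
	if ((rec->owner >> 63) == 0)
		key.offset += adj;
	return key;
}

/* Assumed ordering: start block, then owner, then offset. */
static inline int rmap_key_cmp(const struct rmap_ikey *a,
			       const struct rmap_ikey *b)
{
	if (a->startblock != b->startblock)
		return a->startblock < b->startblock ? -1 : 1;
	if (a->owner != b->owner)
		return a->owner < b->owner ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* A subtree can only hold a match for key q if low <= q <= high. */
static inline bool rmap_ptr_may_contain(const struct rmap_ikey *low,
					const struct rmap_ikey *high,
					const struct rmap_ikey *q)
{
	return rmap_key_cmp(low, q) <= 0 && rmap_key_cmp(q, high) <= 0;
}
----

With both keys stored next to each pointer, a lookup can skip any subtree
whose key range cannot contain the block being searched for, even though the
records beneath it cover intervals rather than single blocks.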
==== xfs_db rtrmapbt Example
This example shows a real-time reverse-mapping B+tree from a freshly
populated root filesystem:
----
xfs_db> path -m /rtgroups/0.rmap
xfs_db> p
core.magic = 0x494e
core.mode = 0100000
core.version = 3
core.format = 5 (rmap)
...
u3.rtrmapbt.level = 1
u3.rtrmapbt.numrecs = 3
u3.rtrmapbt.keys[1-3] = [startblock,owner,offset,attrfork,bmbtblock,
startblock_hi,owner_hi,offset_hi,attrfork_hi,
bmbtblock_hi]
1:[0,-3,0,0,0,682,10015,681,0,0]
2:[228,10014,227,0,0,454,10014,453,0,0]
3:[456,10014,455,0,0,682,10014,681,0,0]
u3.rtrmapbt.ptrs[1-3] = 1:10 2:11 3:12
----

This is a two-level tree, so we should follow it towards the leaves.

----
xfs_db> addr u3.rtrmapbt.ptrs[1]
xfs_db> p
magic = 0x4d415052
level = 0
numrecs = 115
leftsib = null
rightsib = 11
bno = 80
lsn = 0
uuid = 23d157a4-8ca7-4fca-8782-637dc6746105
owner = 133
crc = 0x4c046e7d (correct)
recs[1-115] = [startblock,blockcount,owner,offset,extentflag,attrfork,bmbtblock]
1:[0,1,-3,0,0,0,0]
2:[1,682,10015,0,0,0,0]
3:[2,1,10014,1,0,0,0]
4:[4,1,10014,3,0,0,0]
5:[6,1,10014,5,0,0,0]
6:[8,1,10014,7,0,0,0]
7:[10,1,10014,9,0,0,0]
8:[12,1,10014,11,0,0,0]
9:[14,1,10014,13,0,0,0]
...
112:[220,1,10014,219,0,0,0]
113:[222,1,10014,221,0,0,0]
114:[224,1,10014,223,0,0,0]
115:[226,1,10014,225,0,0,0]
----
Several interesting things pop out here. The last record shown (record 115)
indicates that inode 10,014 has mapped real-time block 226 at file offset 225.
We confirm this by looking at the block map for that inode:
----
xfs_db> inode 10014
xfs_db> p core.realtime
core.realtime = 1
xfs_db> bmap 220 10
data offset 221 startblock 222 (0/222) count 1 flag 0
data offset 223 startblock 224 (0/224) count 1 flag 0
data offset 225 startblock 226 (0/226) count 1 flag 0
data offset 227 startblock 228 (0/228) count 1 flag 0
data offset 229 startblock 230 (0/230) count 1 flag 0
----
Notice that inode 10,014 has the real-time flag set, which means that its
data blocks are all allocated from the real-time device.
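
The manual comparison above can be expressed as a small check. This is a
sketch under assumptions rather than a description of any existing tool: the
structure and function names are hypothetical, and the +(0/226)+ notation in
the +bmap+ output is assumed to mean real-time group 0, group block 226.

[source, c]
----
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-endian reverse-mapping record. */
struct rmap_irec {
	uint32_t	startblock;	/* rtgroup block number */
	uint32_t	blockcount;
	uint64_t	owner;		/* inode number */
	uint64_t	offset;		/* logical file block offset */
};

/* Hypothetical forward mapping, as printed by the bmap command. */
struct bmap_extent {
	uint64_t	fileoff;	/* logical file block offset */
	uint32_t	rgbno;		/* rtgroup block number */
	uint32_t	count;
};

/*
 * Return true if the reverse-mapping record accounts for every block of
 * the forward mapping: same owner, physical range fully covered, and the
 * same distance between physical block and file offset.
 */
static inline bool rmap_confirms_bmap(const struct rmap_irec *rec,
				      uint64_t ino,
				      const struct bmap_extent *ext)
{
	uint64_t delta;

	if (rec->owner != ino)
		return false;
	if (ext->rgbno < rec->startblock)
		return false;
	delta = (uint64_t)ext->rgbno - rec->startblock;
	if (delta + ext->count > rec->blockcount)
		return false;
	return rec->offset + delta == ext->fileoff;
}
----

Record 115 from the leaf block (+startblock+ 226, +blockcount+ 1, +owner+
10014, +offset+ 225) and the +bmap+ line for offset 225 would be passed to
this function as the record and the extent respectively.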