Coverage report: /development/source/library/org/datagraph/spocq-shard/src/store/rlmdb/repository-streaming.lisp
| Kind | Covered | All | % |
|---|---|---|---|
| expression | 335 | 709 | 47.2 |
| branch | 13 | 40 | 32.5 |
;;; -*- Mode: lisp; Syntax: ansi-common-lisp; Base: 10; Package: org.datagraph.spocq.implementation; -*-

(in-package :org.datagraph.rdf.lmdb.implementation)

(:documentation "LMDB repository multi-revision streaming
LMDB-based bgp processing relies on this as the quad source."

"The streaming operators provide an interface to a repository
that supports transactions covering temporal ranges, as
well as individual bgps which are governed by a temporal expression.

At this level, the ranges are
- single revisions: as ordinals or temporal-id (uuid)
- single intervals [closed, closed]
These are used by the query invocation mechanism to implement

Temporal expressions take the form of logical combinations of Allen relations.
They are established by extracting and rewriting filter forms present in a bgp
into a predicate to be applied to the visibility attributes of the matched statements.

There are two match/scan operators:
- match-repository-statements : accepts ordinal ranges
- match-replicated-repository-statements : accepts temporal-id ranges

The match/scan operation is modified such that the transaction properties, the
uuid, ordinal and timestamp, are available during its dynamic context.
The results can appear in one of two orders:
- by revision pattern: collated by matched revision, ordered by quad, and

- by quad pattern: collated by matched quad, ordered by revision

In both cases duplicates may appear, either when they appear in distinct revision
graphs or when a quad is visible in multiple revision windows.
In the simple streaming case, within a single window, the spog* traversal yields
no duplicates, as the index deduplicates.

In order to stream time-series data, an alternative to spog* indices locates
graphs by revision ordinal. While qpog* indices are based on term number quads
and the timestamps are not in-lined, the ordinal->graph index maps the
revision directly to the quad set.

The revision metadata includes the transaction time, as a timestamp, but it is available only
through the ordinals: one must search the ordinals as a sequence to constrain a min/max according to
a temporal window. After this, the ordinal bounds are used to constrain the quad visibility.

For bitemporal streaming, the application time is not in the same value domain
as the transaction time: instead of a quad index, the index is 5-d to integrate
the time into the graph and sort order is by function rather than position.

Alternative implementations:
graphdb : http://graphdb.ontotext.com/documentation/free/data-history-and-versioning.html
retains the diff stream

allegrograph : https://franz.com/agraph/support/documentation/current/text-index.html
included just a cl example, but no sparql
the solr variant included sparql, but it was not clear how the query was expressed and interpreted.
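
The note above says that transaction timestamps are reachable only through the revision ordinals, so a temporal window first has to be translated into min/max ordinal bounds. The following is a minimal sketch of that step, assuming the revision metadata is available as a vector of `(ordinal . timestamp)` pairs sorted by ordinal. The function name `window-ordinal-bounds` and the data layout are hypothetical and are not taken from the repository code.

```lisp
(defun window-ordinal-bounds (revisions start end)
  "Return (values min-ordinal max-ordinal) over the REVISIONS whose timestamp
lies in the closed window [START, END]; both values are NIL if none does.
REVISIONS is assumed to be a vector of (ordinal . timestamp) pairs."
  (let ((min-ordinal nil)
        (max-ordinal nil))
    (loop for (ordinal . timestamp) across revisions
          when (<= start timestamp end)
            do (unless min-ordinal (setf min-ordinal ordinal))
               (setf max-ordinal ordinal))
    (values min-ordinal max-ordinal)))

;; hypothetical metadata: ordinals 1..4 with increasing timestamps
(defparameter *sample-revisions*
  (vector '(1 . 100) '(2 . 200) '(3 . 300) '(4 . 400)))

(window-ordinal-bounds *sample-revisions* 150 350)
```

The resulting bounds would then play the role of the `:first` and `:last` arguments from which a revision predicate is computed.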

(:documentation "quad matching"
"map-repository-statements :
Supports two matching processes
- match by quad pattern, filter by visibility:

  - set up a cursor for the index,
  - scan until the next result does not match,
  - continue for each quad which matches the revision ordinal
- match by revision ordinal
  - filter by quad pattern

(defgeneric rlmdb:map-repository-statements (operator repository quad-pattern

  "Given a repository, determine which index to use based on the pattern
and the available indices and delegate to its method.
If the pattern is wild and revisions are specified and the repository
includes a revision->graph database, then use that. Otherwise the pattern
specificity determines a spog* index.")

  (:method (operator (repository rlmdb:repository) quad-pattern &rest args)
    "The base method for a repository just returns nil to indicate no index applied"
    (declare (ignore args))

  (:method (operator (id string) quad-pattern &rest args)
    "Given a string, operate on the dydra repository"
    (declare (dynamic-extent args))
    (apply #'rlmdb:map-repository-statements operator (spocq.i:repository id) quad-pattern args))

  (:method (operator (transaction spocq.i::lmdb-transaction) (quad-pattern t)

    "Given an api transaction, delegate to its revision"
    (apply #'rlmdb:map-repository-statements operator (transaction-revision transaction)

  (:method (operator (revision spocq.i::lmdb-revision) (quad-pattern t)
            &rest args &key revision-predicate
            domain-predicate scan-order)
    "Given an api revision, capture its ordinal bounds and use them to operate on the storage"
    (declare (ignore domain-predicate scan-order))
    (apply #'rlmdb:map-repository-statements operator (repository-lmdb-repository revision)

           :revision-predicate (or revision-predicate
                                   (compute-revision-predicate (list :first (spocq.i::revision-min-revision-ordinal revision)
                                                                     :last (spocq.i::revision-max-revision-ordinal revision))))

  (:method (operator (repository spocq.i::lmdb-repository) quad-pattern &rest args)
    "Given an api repository, delegate to its storage"
    (declare (dynamic-extent args))
    (apply #'rlmdb:map-repository-statements operator (repository-lmdb-repository repository)

  (:method :around (operator (repository rlmdb:repository) quad-pattern &rest args)
    "For any storage repository, establish a transaction context and continue with the
methods applicable to its storage variant."
    (declare (ignore args))
    (cond ((and lmdb:*transaction*
                (eq (lmdb:transaction-environment lmdb:*transaction*) repository))
           ;; (warn "reusing transaction: ~s" *transaction*)

           (lmdb:with-transaction ((transaction (lmdb:make-transaction repository :flags liblmdb:+rdonly+))
                                   :initial-disposition :begin :normal-disposition :abort
                                   :error-disposition :abort)
             (call-next-method)))))

  ;; the concrete methods allow for two circumstances:
  ;; - wild patterns scan the entire repository, which entails one index from each variant
  ;; - variant specific patterns match/scan the variant only
  ;;;!!! must construct the respective temporal / time-series / revision predicate

  (:method (operator (repository rlmdb::quad-index-repository) quad-pattern &rest args
            &key scan-order &allow-other-keys)
    "This is intended to be the base method given the class precedence.
It always applies the index which matches the statement combination"
    (let ((index-database (repository-quad-pattern-database repository quad-pattern :scan-order scan-order)))
      (apply #'rlmdb::map-index-statements operator index-database quad-pattern args)))

  (:method (operator (repository rlmdb::temporal-index-repository) quad-pattern &rest args
            &key domain-predicate &allow-other-keys)
    "If the pattern includes a temporal predicate term, then use a temporal
index - those statements must be present in this index.
If a temporal predicate is supplied, then results are restricted to this index.
If the pattern is wild, then it includes everything"
    (let* ((predicate-term (spocq.i:predicate quad-pattern))
           (wild-pattern-p (wild-term-p predicate-term)))
      (flet ((map-temporal-index ()
               (let ((index-database (aref (repository-temporal-databases repository)
                                           (temporal-pattern-key-map-index quad-pattern))))
                 (apply #'rlmdb::map-index-statements operator index-database quad-pattern args))))
        (cond ((or (rlmdb:repository-temporal-predicate-p repository predicate-term)

               (map-temporal-index))

               (call-next-method))))))

  (:method (operator (repository rlmdb::time-series-index-repository) quad-pattern &rest args
            &key domain-predicate &allow-other-keys)
    "Use a time-series index if
- the pattern is wild for the predicate term,
- the predicate term from the pattern is declared for time series,
- a predicate constraint sequence is provided, or
- there is a revision predicate to be applied to the indexed revision terms.
When either the predicate term is wild and no constraint set is provided, or
the pattern does not apply, continue with other indices as well."
    (let* ((predicate-term (spocq.i:predicate quad-pattern))
           (wild-pattern-p (wild-term-p predicate-term)))
      (flet ((map-time-series-index ()
               (let ((index-database (aref (repository-time-series-databases repository)
                                           (time-series-pattern-key-map-index quad-pattern))))
                 (apply #'rlmdb::map-index-statements operator index-database quad-pattern args))))
        (cond ((or (rlmdb:repository-time-series-predicate-p repository predicate-term)

               (map-time-series-index))

               (map-time-series-index)

               (call-next-method))))))
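
The methods of `rlmdb:map-repository-statements` form a delegation chain: an api-level object is unwrapped to its storage repository, and an `:around` method on the storage class establishes a transaction unless one is already bound. The following sketch shows that shape with plain CLOS. All names here (`*sketch-transaction*`, `sketch-storage`, `sketch-api-repository`, `sketch-map-statements`) are hypothetical stand-ins, and the list bound as the transaction merely imitates a read-only LMDB transaction.

```lisp
(defvar *sketch-transaction* nil
  "Hypothetical stand-in for lmdb:*transaction*.")

(defclass sketch-storage ()
  ((name :initarg :name :reader storage-name)))

(defclass sketch-api-repository ()
  ((storage :initarg :storage :reader repository-storage)))

(defgeneric sketch-map-statements (operator source pattern)
  (:method (operator (source sketch-api-repository) pattern)
    ;; an api-level object only delegates to its storage
    (sketch-map-statements operator (repository-storage source) pattern))
  (:method (operator (source sketch-storage) pattern)
    ;; the primary method runs inside whatever transaction is bound
    (funcall operator (list (storage-name source) pattern *sketch-transaction*)))
  (:method :around (operator (source sketch-storage) pattern)
    (declare (ignorable operator pattern))
    ;; reuse an enclosing transaction, otherwise bind a fresh one
    (if *sketch-transaction*
        (call-next-method)
        (let ((*sketch-transaction* (list :read-only (storage-name source))))
          (call-next-method)))))

(defparameter *sketch-repository*
  (make-instance 'sketch-api-repository
                 :storage (make-instance 'sketch-storage :name "foaf")))

(sketch-map-statements #'identity *sketch-repository* '(?s ?p ?o))
```

Because the transaction is a special variable, nested calls made from within the operator see the same binding and take the reuse branch.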

none {graph ?g {} } -2, !! this should not yield the default graph
(test-sparql "select ?g ?s where {graph ?g {?s ?p ?o}}" :repository-id "james/foaf")
<urn:dydra:named> {} -2
<urn:dydra:named> {graph ?g {} } -2, but as there is no default graph bgp, no result

<urn:dydra:all> {graph ?g {} } 0

(defmethod rlmdb::map-index-statements (operator (index rlmdb::rdfcache-quad-database) (quad-pattern t)
                                        &key revision-predicate)
  "the logic for an rdfcache quad database computes the key as a mapped pattern
and maps the matched index entries back."
  (lmdb:with-database (index)
    (let* ((highest (rlmdb:find-last-ordinal index)) ;; see below for timing
           (cur (lmdb:make-cursor index :transaction lmdb:*transaction*))
           (named-only (case (graph quad-pattern)
                         ((-2 |urn:dydra|:|named|) t)

           (graph-none (case (graph quad-pattern)
                         ((-4 |urn:dydra|:|none|) t)

           (index-db-index (quad-pattern-key-map-index quad-pattern))
           (key-maps (index-database-key-maps index))
           (quad-map (aref key-maps index-db-index))
           ;; find the position which maps to the graph term
           (quad-graph-index (position 0 quad-map :test #'=))
           (wild-pattern-p (wild-quad-pattern-p quad-pattern))

      ;;(let ((%key-quad (cffi:foreign-alloc '(:struct spocq.i:quad))))
      (cffi:with-foreign-objects ((%quad-pattern '(:struct spocq.i:quad))
                                  (%key-quad '(:struct spocq.i:quad))
                                  (%result-quad '(:struct spocq.i:quad)))
        (lmdb::with-empty-value (raw-key)
          (lmdb::with-empty-value (raw-value)
            (flet ((map-for-graph (quad-pattern)

                     (incf spocq.i::*match-requests*)
                     (quad-to-term-number-key quad-pattern %quad-pattern quad-map)
                     (%copy-quad %quad-pattern %key-quad)

                     ;;(%print-quad %quad-pattern *trace-output*)
                     ;;(%print-quad %key-quad *trace-output*)
                     (lmdb:with-cursor (cur)
                       (let ((%cursor (lmdb::handle cur))
                             (visibility-unit (cffi:foreign-type-size :uint32)))
                         (labels ((get-quad (get-op)
                                    ;(lmdb::with-empty-value (raw-key)
                                    ;(lmdb::with-empty-value (raw-value)

                                    (setf (cffi:foreign-slot-value raw-key '(:struct liblmdb:val) 'liblmdb:mv-size) (cffi:make-pointer 16)
                                          (cffi:foreign-slot-value raw-key '(:struct liblmdb:val) 'liblmdb:mv-data) %key-quad)
                                    ;; (%print-quad %key-quad *trace-output*)

                                    ;;(print :key-quad *trace-output*)
                                    ;;(%print-quad %key-quad *trace-output*)
                                    (let ((return-code (liblmdb:cursor-get %cursor

                                      ;; (print (list quad-pattern return-code))
                                      (alexandria:switch (return-code)

                                        (call-with-quad-entry raw-key raw-value))

                                        (lmdb::unknown-error return-code)))))
                                  (call-with-quad-entry (k v)
                                    ;;(print :call-with-quad-entry)
                                    (assert (= 16 (%mdb-val-size k))

                                            "key size is invalid: ~s" (%mdb-val-size k))
                                    (let* ((%index-quad (%mdb-val-data k))
                                           (visibility-bytes (%mdb-val-size v)))
                                      ;; continue until either no longer matched or the operator returns nil
                                      ;; (print (list :no named-only :qm quad-map :qgi quad-graph-index :g (cffi:mem-aref %index-quad 'term-id quad-graph-index)))
                                      ;; (%print-quad %index-quad *trace-output*)
                                      (cond ((and named-only (= (cffi:mem-aref %index-quad 'term-id quad-graph-index) #xffffffff))

                                            ((or wild-pattern-p (%quad-match-p %quad-pattern %index-quad)) ;; iff still in range

                                             (cond ((or (zerop visibility-bytes) ; not revisioned
                                                        (null revision-predicate)
                                                        ;; for bi-temporal cases
                                                        (funcall revision-predicate (%mdb-val-data v) visibility-bytes))
                                                    ;; if visible apply op and return the yes/no continue indication
                                                    (map-repository-statements-callback operator
                                                                                        (if (= index-db-index 0)

                                                                                            (term-number-key-to-term-number-quad %index-quad %result-quad quad-map)))

                           (loop for op = :+set-range+ then :+next+

                                 do (incf count)))))))
              (typecase (graph quad-pattern) ;; if a set enumerate, otherwise scan the single graph
                (cons (loop with single-graph-quad-pattern = (copy-quad-pattern quad-pattern)
                            for graph in (graph quad-pattern)
                            do (progn (setf (graph single-graph-quad-pattern) graph)
                                      (map-for-graph single-graph-quad-pattern))))
                (t (unless graph-none
                     (when named-only ;; with special handling for named graphs
                       (setf (graph quad-pattern) 0))
                     (map-for-graph quad-pattern))))))))
      ;; should happen in the bgp processor, not here (incf spocq.i::*match-responses* match-count)
      (values scan-count match-count))))
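
The scan in `rlmdb::map-index-statements` positions a cursor with a set-range operation and then steps forward until a key no longer matches or the operator returns nil. The sketch below imitates that control flow over an in-memory sorted vector instead of an LMDB cursor. It assumes that keys are integer vectors, that a pattern term of 0 is wild, and that the constant terms form a key prefix; the names `key-match-p`, `key<` and `scan-range` are hypothetical.

```lisp
(defun key-match-p (pattern key)
  "True when every non-zero PATTERN term equals the corresponding KEY term."
  (every #'(lambda (p k) (or (zerop p) (= p k))) pattern key))

(defun key< (a b)
  "Lexicographic order on equal-length integer vectors."
  (loop for x across a
        for y across b
        do (cond ((< x y) (return t))
                 ((> x y) (return nil)))
        finally (return nil)))

(defun scan-range (operator keys pattern)
  "Start at the first key not less than PATTERN (the set-range step), then
advance (the next step) while keys match and OPERATOR returns true.
Returns (values scan-count match-count)."
  (let ((scan-count 0)
        (match-count 0)
        (start (or (position-if-not #'(lambda (key) (key< key pattern)) keys)
                   (length keys))))
    (loop for index from start below (length keys)
          for key = (aref keys index)
          do (incf scan-count)
             (cond ((not (key-match-p pattern key)) (return))
                   (t (incf match-count)
                      (unless (funcall operator key) (return)))))
    (values scan-count match-count)))

;; hypothetical index content, sorted lexicographically
(defparameter *sample-keys*
  (vector #(1 5 5 5) #(2 1 1 1) #(2 3 4 5) #(3 1 1 1)))

(scan-range #'(lambda (key) (print key) t) *sample-keys* #(2 0 0 0))
```

The scan count exceeds the match count by one whenever the loop ends on the first non-matching key, which is the cost of detecting the end of the range.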

(defun rlmdb.i::map-repository-statements-callback (operator quad)

  (when cl-user::*map-repository-statements-callback.verbose*
    (let ((quad-string (with-output-to-string (stream) (spocq.i::%print-quad quad stream))))
      (format *trace-output* "mrs: ~a" quad-string)
      ;(spocq.i::log-warn "mrs: ~a" quad-string)

  ;; for multi-threaded access, this effectively serializes the scans
  ;;(incf spocq.i::*match-responses* )
  (funcall operator quad))

(defgeneric rlmdb:map-bitemporal-statements (operator repository pattern &key first last start end)

  "Given a bi-temporal repository and a pattern which involves a temporal predicate and a constant
time value as the object, determine which temporal index to use based on the pattern
and the available indices and scan its matching statements.
If the start and/or end are specified, constrain the result to visibility.
The spog* indices are not used as the object timestamp dominates.")

  (:method (operator (id string) pattern &rest args)
    (declare (dynamic-extent args))
    (apply #'rlmdb:map-bitemporal-statements operator (spocq.i:repository id) pattern args))

  (:method (operator (repository rlmdb:bitemporal-repository) pattern &rest args &key start end first last)
    (declare (ignore start end first last))
    (lmdb:with-transaction ((transaction (lmdb:make-transaction repository :flags liblmdb:+rdonly+))
                            :initial-disposition :begin :normal-disposition :abort
                            :error-disposition :abort)
      (let ((index-database (temporal-pattern-key-map-index repository pattern)))
        (apply #'rlmdb:map-bitemporal-statements operator index-database pattern args)))))

(defmethod rlmdb:map-bitemporal-statements (operator (index rlmdb:temporal-index-database) (tquad-pattern t)
                                            &key (first nil) (last nil) (start nil) (end nil))
  "When invoked for a temporal index database, the pattern is intended to comprise five elements:
(context subject predicate object timestamp)
This is rendered to a key record and used together with any first/last to govern the scan.
Matched keys are passed to the given operator, which then uses just the c-s-p-o elements and ignores the

  (unless first (setf first (rlmdb:find-last-ordinal index)))
  (unless last (setf last first))
  (lmdb:with-database (index)
    (let* ((highest (rlmdb:find-last-ordinal index)) ;; see below for timing
           (cur (lmdb:make-cursor index :transaction lmdb:*transaction*))
           (named-only (case (graph tquad-pattern)
                         ((-2 |urn:dydra|:|named|) t)

           (graph-none (case (graph tquad-pattern)
                         ((-4 |urn:dydra|:|none|) t)

           ;; find the position which maps to the graph term

           (wild-pattern-p (wild-quad-pattern-p tquad-pattern)) ;; at least the timestamp should be constant

      ;;(let ((%key-quad (cffi:foreign-alloc '(:struct spocq.i:quad))))
      (cffi:with-foreign-objects ((%tquad-key '(:struct spocq.i::tquad))
                                  (%result-quad '(:struct spocq.i::tquad)))
        (lmdb::with-empty-value (raw-key)
          (lmdb::with-empty-value (raw-value)
            (flet ((map-for-graph (tquad-pattern)

                     (incf spocq.i::*match-requests*)
                     (tquad-to-quad-record tquad-pattern %tquad-key)
                     (lmdb:with-cursor (cur)
                       (let ((%cursor (lmdb::handle cur))
                             (visibility-unit (cffi:foreign-type-size :uint32)))
                         (labels ((get-quad (get-op)
                                    ;(lmdb::with-empty-value (raw-key)
                                    ;(lmdb::with-empty-value (raw-value)

                                    (setf (cffi:foreign-slot-value raw-key '(:struct liblmdb:val) 'liblmdb:mv-size) (cffi:make-pointer 24)
                                          (cffi:foreign-slot-value raw-key '(:struct liblmdb:val) 'liblmdb:mv-data) %tquad-key)
                                    ;; (%print-quad %key-quad *trace-output*)

                                    ;;(print :key-quad *trace-output*)
                                    ;;(%print-quad %key-quad *trace-output*)
                                    (let ((return-code (liblmdb:cursor-get %cursor

                                      ;; (print (list quad-pattern return-code))
                                      (alexandria:switch (return-code)

                                        (call-with-tquad-entry raw-key raw-value))

                                        (lmdb::unknown-error return-code)))))
                                  (call-with-tquad-entry (k v)
                                    ;;(print :call-with-tquad-entry)
                                    (assert (= 24 (%mdb-val-size k))

                                            "key size is invalid: ~s" (%mdb-val-size k))
                                    (let* ((%index-tquad (%mdb-val-data k))
                                           (visibility-bytes (%mdb-val-size v))
                                           (visibility-count (if (plusp visibility-bytes) (/ visibility-bytes visibility-unit) 0))

                                      ;; continue until either no longer matched or the operator returns nil
                                      ;; (print (list :no named-only :qm quad-map :qgi quad-graph-index :g (cffi:mem-aref %index-quad 'term-id quad-graph-index)))
                                      ;; (%print-quad %index-quad *trace-output*)
                                      (cond ((and named-only (= (cffi:mem-aref %index-tquad 'term-id 0) #xffffffff))
                                             ;; skip default graph

                                            ((or wild-pattern-p (%quad-match-p %tquad-pattern %index-tquad)) ;; iff still in range

                                             (cond ((and (oddp visibility-count) (>= first highest))
                                                    ; head, just the last revision matters
                                                    (map-bitemporal-statements-callback operator

                                                   ((setf position (%test-visibility-range first last (%mdb-val-data v) visibility-count))
                                                    (map-bitemporal-statements-callback operator

                                                                                        (cffi:mem-aref (%mdb-val-data v) :uint32 position)))

                           (loop for op = :+set-range+ then :+next+

                                 do (incf count)))))))
              (typecase (graph tquad-pattern) ;; if a set enumerate, otherwise scan the single graph
                (cons (loop with single-graph-tquad-pattern = (copy-tquad-pattern tquad-pattern)
                            for graph in (graph tquad-pattern)
                            do (progn (setf (graph single-graph-tquad-pattern) graph)
                                      (map-for-graph single-graph-tquad-pattern))))
                (t (unless graph-none
                     (when named-only ;; with special handling for named graphs
                       (setf (graph tquad-pattern) 0))
                     (map-for-graph tquad-pattern))))))))
      ;; should happen in the bgp processor, not here (incf spocq.i::*match-responses* match-count)
      (values scan-count match-count))))
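
The method above treats the index value as a sequence of `:uint32` visibility ordinals, tests `(oddp visibility-count)` for the head revision, and otherwise asks `%test-visibility-range` for a position. The source does not show how that test works. The sketch below is one plausible reading, stated as an assumption: the ordinals alternate insertion and deletion in ascending order, so an odd count means the statement is still live. The function name `sketch-test-visibility-range` is hypothetical and operates on a Lisp vector rather than foreign memory.

```lisp
(defun sketch-test-visibility-range (first last visibility)
  "Return the position of the insertion ordinal whose lifetime overlaps the
closed revision interval [FIRST, LAST], or NIL if there is none.
ASSUMPTION: VISIBILITY alternates insertion and deletion ordinals in
ascending order; a trailing insertion without a deletion is still live."
  (loop for position from 0 below (length visibility) by 2
        for inserted = (aref visibility position)
        for deleted = (if (< (1+ position) (length visibility))
                          (aref visibility (1+ position))
                          nil)
        ;; a statement inserted at i and deleted at d is visible in i .. d-1
        when (and (<= inserted last)
                  (or (null deleted) (> deleted first)))
          return position))

;; inserted at 2, deleted at 5, inserted again at 8 and still live
(sketch-test-visibility-range 3 3 #(2 5 8))
(sketch-test-visibility-range 5 7 #(2 5 8))
(sketch-test-visibility-range 6 9 #(2 5 8))
```

Under this reading, the returned position identifies the ordinal that is passed to the callback together with the matched quad.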

(defun rlmdb.i::map-bitemporal-statements-callback (operator quad ordinal)

  (when cl-user::*map-repository-statements-callback.verbose*
    (let ((quad-string (with-output-to-string (stream) (spocq.i::%print-quad quad stream))))
      (format *trace-output* "mrs: ~a ~a" quad-string ordinal)
      ;(spocq.i::log-warn "mrs: ~a" quad-string)

  (funcall operator quad ordinal))

;;; map over indices which index event identifiers in the respective value domain.
;;; these can be scanned so as to emit results in event order and, for some pattern combinations,
;;; collate multi-statement patterns into solutions in-line, without joining.
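
The comment above describes collating multi-statement patterns into solutions in-line when the index already delivers statements in event order. The following is a minimal sketch of such a single-pass collation, assuming each statement arrives as an `(event predicate object)` list sorted by event. The function name `collate-by-event` and the solution layout are hypothetical.

```lisp
(defun collate-by-event (statements)
  "Group STATEMENTS, a list of (event predicate object) lists already sorted
by event, into one (event . bindings) solution per event in a single pass.
Bindings within a solution appear in reverse arrival order."
  (let ((solutions '()))
    (dolist (statement statements (nreverse solutions))
      (destructuring-bind (event predicate object) statement
        (if (and solutions (eql (car (first solutions)) event))
            ;; same event as the previous statement: extend the open solution
            (push (cons predicate object) (cdr (first solutions)))
            ;; a new event opens a new solution
            (push (list event (cons predicate object)) solutions))))))

;; hypothetical event-ordered stream
(collate-by-event '((1 :temperature 20) (1 :humidity 40) (2 :temperature 21)))
```

No join is needed because statements that belong to one solution are adjacent in the stream; an unsorted stream would split an event across several solutions.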

(defgeneric rlmdb:map-repository-events (operator repository quad-pattern

  "Given a repository which includes event indices, use the combination of
pattern specificity and sort precedence to determine which index to use.
With the goal, to constitute events, the index is chosen to permit to

Once the index is chosen, delegate to its method.
If collation is intended, add a continuation to collate solutions.")

  (:method (operator (repository rlmdb:repository) (pattern t) &rest args)
    "The base method for a repository just returns nil to indicate no index applied"
    (declare (ignore args))

  (:method (operator (id string) (pattern t) &rest args)
    "Given a string, operate on the dydra repository"
    (declare (dynamic-extent args))
    (apply #'rlmdb:map-repository-events operator (spocq.i:repository id) pattern args))

  (:method (operator (transaction spocq.i::lmdb-transaction) (pattern t)

    "Given an api transaction, delegate to its revision"
    (apply #'rlmdb:map-repository-events operator (transaction-revision transaction)

  (:method (operator (revision spocq.i::lmdb-revision) (pattern t)

            &key revision-predicate
            domain-predicate term-precedence)
    "Given an api revision, capture its ordinal bounds and use them to operate on the storage"
    (declare (ignore domain-predicate term-precedence))
    (apply #'rlmdb:map-repository-events operator (repository-lmdb-repository revision)

           :revision-predicate (or revision-predicate
                                   (compute-revision-predicate (list :first (spocq.i::revision-min-revision-ordinal revision)
                                                                     :last (spocq.i::revision-max-revision-ordinal revision))))

  (:method (operator (repository spocq.i::lmdb-repository) (pattern t) &rest args)
    "Given an api repository, delegate to its storage"
    (declare (dynamic-extent args))
    (apply #'rlmdb:map-repository-events operator (repository-lmdb-repository repository)

  (:method :around (operator (repository rlmdb:repository) (pattern t) &rest args)
    "For any storage repository, establish a transaction context and continue with the
methods applicable to its storage variant."
    (declare (ignore args))
    (lmdb:with-transaction ((transaction (lmdb:make-transaction repository :flags liblmdb:+rdonly+))
                            :initial-disposition :begin :normal-disposition :abort
                            :error-disposition :abort)

  ;; the only valid access path is a pattern list.
  ;; match the pattern abstracted over predicates and collate g/s/p/e
  (:method (operator (repository rlmdb::time-series-index-repository) (pattern spocq:quad) &rest args)

    (apply #'rlmdb:map-repository-events operator repository
           (quad-to-quad-record pattern (make-array 4))

    (log-warn "rlmdb:map-repository-events: invalid pattern: ~s ~s ~s"
              repository pattern args)

  (:method (operator (repository rlmdb::time-series-index-repository) (pattern vector) &rest args)

    (let ((index-database (repository-time-series-pattern-database repository quad-pattern :scan-order term-precedence)))
      (apply #'rlmdb::map-index-statements operator index-database quad-pattern args))
    (log-warn "rlmdb:map-repository-events: invalid pattern: ~s ~s ~s"
              repository pattern args)

  (:method (operator (repository rlmdb::time-series-index-repository) (patterns cons) &rest args

    (declare (dynamic-extent args))
    (let ((index-database (or (repository-scan-database repository (remove :predicate term-precedence))
                              (repository-time-series-pattern-database repository (first patterns)))))
      (apply #'rlmdb::map-collated-index-statements operator index-database patterns args)))

  ;; other repository types should not get here
  (:method ((operator t) (repository rlmdb::repository) quad-pattern &rest args)
    (log-warn "rlmdb:map-repository-events: invoked without event indices: ~s ~s ~s"
              repository quad-pattern args)

;;; event is either fixed, as a snapshot, or a sort criterion. as a criterion it should combine with the pattern to determine the index to permit collation
;;; other sort criteria should not affect the index, as the index terms are not in a value domain
;;; eg: (triple ?::s 1 ?::o).(?::s ?::e)

for pattern = (cons 'spocq.a:|quad|
                    (loop for bit downfrom 3 to 0
                          for var in '(?::s ?::p ?::o ?::c)
                          collect (if (logbitp bit i) var "constant")))
append (loop for order in (dsu:combinations '(?::s ?::p ?::graph ?::event))
             collect (list pattern order))))
(precedence (loop for (pattern sort-order) in combis
                  collect (compute-term-precedence pattern sort-order '?::graph '?::event)))
(strings (loop for precedence in precedence
               collect (make-array (length precedence) :element-type 'character
                                   :initial-contents (loop for term-position in precedence
                                                           collect (char-downcase (char (symbol-name term-position) 0)))))))
;; (sort (remove-duplicates (mapcar #'third combis) :test #'string-equal) #'string-lessp))
;; (sort (remove-duplicates (mapcar #'third combis) :test #'string-equal) #'string-lessp))
(defparameter *maximal-indices* (remove-duplicates (remove-if #'(lambda (s) (find #\o s)) (sort strings #'string-lessp)) :test #'string-equal)))

(mapcar #'third combis)

(defparameter *minimal-indices* (sort (loop for name in rlmdb.i::+quad-database-names+ collect (substitute #\e #\o name)) #'string-lessp) )

(flet ((name-match (n1 n2) (loop for i below 3 for c across n1 unless (find c (subseq n2 0 3)) do (return nil) finally (return t))))
  (loop for name in *maximal-indices* collect (cons name (loop for n2 in *minimal-indices* when (name-match name n2) collect n2)))))

(flet ((name-match (n1 n2) (loop for i below 2 for c across n1 unless (find c (subseq n2 0 2)) do (return nil) finally (return t))))
  (loop for name in *maximal-indices* collect (cons name (loop for n2 in *minimal-indices* when (name-match name n2) collect n2)))))

(flet ((name-match (n1 n2) (loop for i below 3 for c across n1 unless (find c (subseq n2 0 3)) do (return nil) finally (return t))))
  (loop for name in *minimal-indices* collect (cons name (loop for n2 in *maximal-indices* when (name-match name n2) collect n2)))))

(flet ((name-match (n1 n2) (loop for i below 2 for c across n1 unless (find c (subseq n2 0 2)) do (return nil) finally (return t))))
  (loop for name in *minimal-indices* collect (cons name (loop for n2 in *maximal-indices* when (name-match name n2) collect n2)))))

;; to furnish streaming, there must be
;; for the single-node streams +event: "ge"/"gesp" "se"/+"segp" "pe"/"pesg" "e"/?
;; for dual node streams +event: "gse"/"segp" "gpe"/"gpes" "spe"/"speg" "g
;; for triad mode streams +event: "gspe"/"gspe"
;; quad index equivalents are : ("esgp" "gesp" "gpes" "gspe" "pesg" "speg")
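
The `name-match` helper used in the forms above pairs index names whose leading term letters agree irrespective of order. The sketch below restates it as a standalone function with the prefix length as an optional parameter, which is an addition made here for illustration; the loop itself follows the `flet` definition in the source. The sample names are taken from the index names mentioned in the surrounding comments.

```lisp
(defun name-match (n1 n2 &optional (prefix-length 3))
  "True when each of the first PREFIX-LENGTH characters of N1 occurs among
the first PREFIX-LENGTH characters of N2, regardless of order.
N2 must be at least PREFIX-LENGTH characters long."
  (loop for i below prefix-length
        for c across n1
        unless (find c (subseq n2 0 prefix-length))
          do (return nil)
        finally (return t)))

(name-match "gse" "segp")    ; g, s and e all occur in "seg"
(name-match "gse" "gpes")    ; s does not occur in "gpe"
(name-match "gp" "gpes" 2)   ; two-letter prefix comparison
```

Since only membership in the prefix is tested, "gse" and "segp" match even though their term orders differ, which is what allows a stream precedence to be served by a differently ordered quad index.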

;;; map over specific term positions
;;; context, subject, predicate or object, whereby the triple positions permit a qualifying context
;;; all use the spog index, as it is not shuffled, and then pick the term from the respective position

(defgeneric rlmdb::map-context-numbers (operator source &key distinct default

  (:documentation "Invoke the given operator for each context term number in the repository.
:distinct (t) : indicate once-only per context
:default (t) : indicate to include the default context. in that case map -1 and #xffffffff both to -1
as this is what rdfcache recognizes while #xffffffff is taken as a possible value, but does not exist")
  (:method (operator (repository spocq.i::lmdb-repository) &rest args)
    (declare (dynamic-extent args))
    (apply #'rlmdb::map-context-numbers operator (spocq.i::repository-lmdb-repository repository) args))
  (:method (operator (repository rlmdb:repository) &key (distinct t) (default t) revision-predicate first last)
    (unless first (setf first (rlmdb:find-last-ordinal repository)))
    (unless last (setf last first))
    (lmdb:with-transaction ((transaction (lmdb:make-transaction repository :flags liblmdb:+rdonly+))
                            :initial-disposition :begin :normal-disposition :abort
                            :error-disposition :abort)
      (let ((cache (make-hash-table :test 'eql))
            (predicate (or revision-predicate
                           (when (or first last) (spocq.i::compute-revision-predicate (list :first first :last last))))))
        (labels ((collect-distinct-terms-w-default (%quad)
                   (let ((term (%quad-context %quad)))
                     (when (default-context-term-number-p term)
                       (setf term rlmdb:*default-context-number*))
                     (cond ((gethash term cache))

                            (setf (gethash term cache) t)
                            (funcall operator term)))))
                 (collect-distinct-terms-no-default (%quad)
                   (let ((term (%quad-context %quad)))
                     (cond ((gethash term cache))
                           ((default-context-term-number-p term) t) ;; skip

                            (setf (gethash term cache) t)
                            (funcall operator term)))))
                 (collect-terms-w-default (%quad)
                   (let ((term (%quad-context %quad)))
                     (when (default-context-term-number-p term)
                       (setf term rlmdb:*default-context-number*))
                     (funcall operator term)))
                 (collect-terms-no-default (%quad)
                   (let ((term (%quad-context %quad)))
                     (cond ((default-context-term-number-p term) t) ;; skip

                            (funcall operator term))))))
          (declare (dynamic-extent #'collect-distinct-terms-w-default #'collect-distinct-terms-no-default
                                   #'collect-terms-w-default #'collect-terms-no-default))
          (rlmdb::map-index-statements (if distinct
                                           (if default #'collect-distinct-terms-w-default #'collect-distinct-terms-no-default)
                                           (if default #'collect-terms-w-default #'collect-terms-no-default))
                                       (rlmdb::repository-gspo-database repository)

                                       :revision-predicate predicate))))))
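
`rlmdb::map-context-numbers` selects one of four local collectors depending on `:distinct` and `:default`, using a hash table to report each context only once. The sketch below folds the four cases into one function over an in-memory list of quads so that the combinations can be compared directly. It assumes that the context is the first element of each quad and that -1 is the canonical default context number; the names `*sketch-default-context-number*`, `sketch-default-context-p` and `sketch-map-context-numbers` are hypothetical.

```lisp
(defparameter *sketch-default-context-number* -1
  "Assumed canonical number for the default context.")

(defun sketch-default-context-p (term)
  "Both -1 and #xffffffff are taken to denote the default context."
  (or (eql term -1) (eql term #xffffffff)))

(defun sketch-map-context-numbers (operator quads &key (distinct t) (default t))
  "Call OPERATOR on the context (first element) of each quad in QUADS.
With DISTINCT, report each context once; without DEFAULT, skip the default context."
  (let ((seen (make-hash-table :test 'eql)))
    (dolist (quad quads)
      (let ((term (first quad)))
        (when (sketch-default-context-p term)
          (setf term *sketch-default-context-number*))
        (cond ((and (not default)
                    (eql term *sketch-default-context-number*))) ; skip
              ((and distinct (gethash term seen)))               ; already reported
              (t (setf (gethash term seen) t)
                 (funcall operator term)))))))

;; hypothetical quads as (context subject predicate object) term numbers
(defparameter *sample-quads*
  '((7 1 2 3) (#xffffffff 1 2 3) (7 4 5 6) (-1 1 1 1) (9 1 1 1)))

(let ((contexts '()))
  (sketch-map-context-numbers #'(lambda (term) (push term contexts)) *sample-quads*)
  (nreverse contexts))
```

The source keeps four separate closures, presumably so that the per-quad callback does not re-test the flags on every call; the single function here trades that for brevity.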

(defgeneric rlmdb::map-subject-numbers (operator source &key distinct revision-predicate first last context)
  (:method (operator (repository spocq.i::lmdb-repository) &rest args)
    (declare (dynamic-extent args))
    (apply #'rlmdb::map-subject-numbers operator (spocq.i::repository-lmdb-repository repository) args))
  (:method (operator (repository rlmdb:repository) &key (distinct t) revision-predicate first last (context 0))
    (unless first (setf first (rlmdb:find-last-ordinal repository)))
    (unless last (setf last first))
    (lmdb:with-transaction ((transaction (lmdb:make-transaction repository :flags liblmdb:+rdonly+))
                            :initial-disposition :begin :normal-disposition :abort
                            :error-disposition :abort)
      (let ((cache (make-hash-table :test 'eql))
            (predicate (or revision-predicate
                           (when (or first last) (spocq.i::compute-revision-predicate (list :first first :last last))))))
        (flet ((collect-distinct-terms (%quad)
                 ;; (print (rlmdb.i::term-number-record-to-vector %quad (vector 0 0 0 0)))
                 (let ((term (%quad-subject %quad)))
                   (cond ((gethash term cache))

                          (setf (gethash term cache) t)
                          (funcall operator term)))))
               (collect-terms (%quad)
                 (funcall operator (%quad-subject %quad))))
          (declare (dynamic-extent #'collect-distinct-terms #'collect-terms))
          (rlmdb::map-index-statements (if distinct #'collect-distinct-terms #'collect-terms)
                                       (rlmdb::repository-gspo-database repository)
                                       (vector context 0 0 0)
                                       :revision-predicate predicate))))))

(defgeneric rlmdb:map-predicate-numbers (operator source &key distinct revision-predicate first last context)
  (:method (operator (repository spocq.i::lmdb-repository) &rest args)
    (declare (dynamic-extent args))
    (apply #'rlmdb::map-predicate-numbers operator (spocq.i::repository-lmdb-repository repository) args))
  (:method (operator (repository rlmdb:repository) &key (distinct t) revision-predicate first last (context 0))
    (unless first (setf first (rlmdb:find-last-ordinal repository)))
    (unless last (setf last first))
    (lmdb:with-transaction ((transaction (lmdb:make-transaction repository :flags liblmdb:+rdonly+))
                            :initial-disposition :begin :normal-disposition :abort
                            :error-disposition :abort)
      (let ((cache (make-hash-table :test 'eql))
            (predicate (or revision-predicate
                           (when (or first last) (spocq.i::compute-revision-predicate (list :first first :last last))))))
        (flet ((collect-distinct-terms (%quad)
                 (let ((term (%quad-predicate %quad)))
                   ;; (print (rlmdb.i::term-number-record-to-vector %quad (vector 0 0 0 0)))
                   (cond ((gethash term cache))

                          (setf (gethash term cache) t)
                          (funcall operator term)))))
               (collect-terms (%quad)
                 (funcall operator (%quad-predicate %quad))))
          (declare (dynamic-extent #'collect-distinct-terms #'collect-terms))
          (rlmdb::map-index-statements (if distinct #'collect-distinct-terms #'collect-terms)
                                       (rlmdb::repository-gspo-database repository)
                                       (vector context 0 0 0)
                                       :revision-predicate predicate))))))

(defgeneric rlmdb::map-object-numbers (operator source &key distinct revision-predicate first last context)
  (:method (operator (repository spocq.i::lmdb-repository) &rest args)
    (declare (dynamic-extent args))
    (apply #'rlmdb::map-object-numbers operator (spocq.i::repository-lmdb-repository repository) args))
  (:method (operator (repository rlmdb:repository) &key (distinct t) revision-predicate first last (context 0))
    (unless first (setf first (rlmdb:find-last-ordinal repository)))
    (unless last (setf last first))
    (lmdb:with-transaction ((transaction (lmdb:make-transaction repository :flags liblmdb:+rdonly+))
                            :initial-disposition :begin :normal-disposition :abort
                            :error-disposition :abort)
      (let ((cache (make-hash-table :test 'eql))
            (predicate (or revision-predicate
                           (when (or first last) (spocq.i::compute-revision-predicate (list :first first :last last))))))
        (flet ((collect-distinct-terms (%quad)
                 ;; (print (rlmdb.i::term-number-record-to-vector %quad (vector 0 0 0 0)))
                 (let ((term (%quad-object %quad)))
                   (cond ((gethash term cache))
                         (t
                          (setf (gethash term cache) t)
                          (funcall operator term)))))
               (collect-terms (%quad)
                 (funcall operator (%quad-object %quad))))
          (declare (dynamic-extent #'collect-distinct-terms #'collect-terms))
          (rlmdb::map-index-statements (if distinct #'collect-distinct-terms #'collect-terms)
                                       (rlmdb::repository-gspo-database repository)
                                       (vector context 0 0 0)
                                       :revision-predicate predicate))))))

(defgeneric rlmdb::map-repository-visibility-vectors (operator source &key context quad-pattern)
  (:method (operator (repository spocq.i::lmdb-repository) &rest args)
    (declare (dynamic-extent args))
    (apply #'rlmdb::map-repository-visibility-vectors operator (spocq.i::repository-lmdb-repository repository) args))

  (:method (operator (repository rlmdb:repository) &rest args)
    (lmdb:with-transaction ((transaction (lmdb:make-transaction repository :flags liblmdb:+rdonly+))
                            :initial-disposition :begin :normal-disposition :abort
                            :error-disposition :abort)
      (apply #'rlmdb::map-repository-visibility-vectors operator
             (rlmdb::repository-gspo-database repository)
             args)))

  (:method (operator (index rlmdb:index-database) &key (context 0) (quad-pattern (vector context 0 0 0)))
    (lmdb:with-database (index)
      (let* ((cur (lmdb:make-cursor index :transaction lmdb:*transaction*))
             (named-only (case (graph quad-pattern)
                           ((-2 |urn:dydra|:|named|) t)
             (quad-map (quad-pattern-key-map quad-pattern))
             ;; find the position which maps to the graph term
             (quad-graph-index (position 0 quad-map :test #'=))
             (wild-pattern-p (wild-quad-pattern-p quad-pattern))
        ;;(let ((%key-quad (cffi:foreign-alloc '(:struct spocq.i:quad))))
        (cffi:with-foreign-objects ((%quad-pattern '(:struct spocq.i:quad))
                                    (%key-quad '(:struct spocq.i:quad))
                                    (%result-quad '(:struct spocq.i:quad)))
          (lmdb::with-empty-value (raw-key)
            (lmdb::with-empty-value (raw-value)
              (flet ((map-for-graph (quad-pattern)
                       (quad-to-quad-record quad-pattern %quad-pattern)
                       (%copy-quad %quad-pattern %key-quad)
                       ;;(%print-quad %quad-pattern *trace-output*)
                       ;;(%print-quad %key-quad *trace-output*)
                       (lmdb:with-cursor (cur)
                         (let ((%cursor (lmdb::handle cur))
                               (visibility-unit (cffi:foreign-type-size :uint32)))
                           (labels ((get-quad (get-op)
                                      ;(lmdb::with-empty-value (raw-key)
                                      ;(lmdb::with-empty-value (raw-value)
                                      (setf (cffi:foreign-slot-value raw-key '(:struct liblmdb:val) 'liblmdb:mv-size) (cffi:make-pointer 16)
                                            (cffi:foreign-slot-value raw-key '(:struct liblmdb:val) 'liblmdb:mv-data) %key-quad)
                                      ;; (%print-quad %key-quad *trace-output*)
                                      ;;(print :key-quad *trace-output*)
                                      ;;(%print-quad %key-quad *trace-output*)
                                      (let ((return-code (liblmdb:cursor-get %cursor
                                        ;; (print (list quad-pattern return-code))
                                        (alexandria:switch (return-code)
                                          (call-with-quad-entry raw-key raw-value))
                                          (lmdb::unknown-error return-code)))))
                                    (call-with-quad-entry (k v)
                                      ;;(print :call-with-quad-entry)
                                      (assert (= 16 (%mdb-val-size k))
                                              "key size is invalid: ~s" (%mdb-val-size k))
                                      (let* ((%index-quad (%mdb-val-data k))
                                             (visibility-bytes (%mdb-val-size v))
                                             (visibility-count (if (plusp visibility-bytes) (/ visibility-bytes visibility-unit) 0)))
                                        ;; continue until either no longer matched or the operator returns nil
                                        ;; (print (list :no named-only :qm quad-map :qgi quad-graph-index :g (cffi:mem-aref %index-quad 'term-id quad-graph-index)))
                                        ;; (%print-quad %index-quad *trace-output*)
                                        (cond ((and named-only (= (cffi:mem-aref %index-quad 'term-id quad-graph-index) #xffffffff))
                                              ((or wild-pattern-p (%quad-match-p %quad-pattern %index-quad))
                                               (term-number-key-to-term-number-quad %index-quad %result-quad quad-map)
                             (loop for op = :+set-range+ then :+next+
                                   do (incf count)))))))
                (typecase (graph quad-pattern) ;; if a set, enumerate; otherwise scan the single graph
                  (cons (loop with single-graph-quad-pattern = (copy-quad-pattern quad-pattern)
                              for graph in (graph quad-pattern)
                              do (progn (setf (graph single-graph-quad-pattern) graph)
                                        (map-for-graph single-graph-quad-pattern))))
                  (t (when named-only ;; with special handling for named graphs
                       (setf (graph quad-pattern) 0))
                     (map-for-graph quad-pattern)))))))
        (incf spocq.i::*match-responses* match-count)

timing the retrieval of the last revision id indicates that supplying the index is notably faster.
that is because the control path through the repository creates a new transaction each time.
also, there is no significant benefit from using the specific internal db, which is not directly available at the call site.

* (let* ((lmdb-repo (repository-lmdb-repository (repository "james/test")))
         (lmdb-index (aref (rlmdb.i::repository-index-databases lmdb-repo) 0))
         (lmdb-meta (rlmdb.i::repository-meta-database lmdb-repo)))
    (lmdb:with-transaction ((transaction (lmdb:make-transaction lmdb-repo :flags liblmdb:+rdonly+))
                            :initial-disposition :begin :normal-disposition :abort
                            :error-disposition :abort)
      (time (dotimes (x 1000000)
              (rlmdb:find-last-ordinal lmdb-index)))
      (time (dotimes (x 1000000)
              (rlmdb:find-last-ordinal lmdb-repo)))
      (time (dotimes (x 1000000)
              (rlmdb.i::get-metadata-property lmdb-index "revision-id")))
      (time (dotimes (x 1000000)
              (rlmdb.i::get-metadata-property lmdb-meta "revision-id")))))

  1.044 seconds of real time
  1.048000 seconds of total run time (0.988000 user, 0.060000 system)
  3,654,709,193 processor cycles
  175,982,592 bytes consed

  5.516 seconds of real time
  5.516000 seconds of total run time (5.428000 user, 0.088000 system)
  [ Run times consist of 0.012 seconds GC time, and 5.504 seconds non-GC time. ]
  19,307,475,468 processor cycles
  512,032,768 bytes consed

  0.802 seconds of real time
  0.800000 seconds of total run time (0.776000 user, 0.024000 system)
  2,806,877,074 processor cycles
  175,982,608 bytes consed

  0.780 seconds of real time
  0.780000 seconds of total run time (0.768000 user, 0.012000 system)
  2,731,689,523 processor cycles
  175,982,592 bytes consed