Coverage report: /development/source/library/org/datagraph/spocq-shard/src/store/rdfcache/provenance.lisp
| Kind | Covered | All | % |
| expression | 185 | 338 | 54.7 |
| branch | 6 | 12 | 50.0 |
;;; -*- Mode: lisp; Syntax: ansi-common-lisp; Base: 10; Package: org.datagraph.spocq.implementation; -*-

(in-package :org.datagraph.spocq.implementation)

;; (load "/development/source/library/org/datagraph/spocq/src/store/property-paths.lisp")

(:documentation "provenance"
"Provenance concerns information about how an entity came to be and about its contributions towards the existence of others.
Dydra exposes meta-data about repositories sufficient to answer questions about

- retrospective repository state
- responsibility for change within users and query logic
Where the storage service provides data aggregation capabilities through transparent federation,
such introspective capacity is necessary in order to substantiate and explicate results.

Lineage is first-order only, as a more precise determination would require computing dependency
based on the particular algebra expressed in a given query request.[cui.2001]
Grain is not as fine as it would need to be to derive the provenance of an arbitrary subgraph,[ding.2005] but we operate
on the assumption that the graph-level granularity is set deliberately.

Each repository is reflected in its account's provenance store and includes properties in its
service description to aid discovery. Beyond that, additional references appear in other documents:
- a query response should include a header rel='provenance' with the anchor being the repository revision. (prov-aq 3.1)
- a query response should include a header rel='provenance-service' with the anchor being the abstract repository uri of the account's provenance repository.
- the service description for the repository should include those links as well.

for html responses, e.g. the query editor page, provenance and provenance-service links should be present in the head, along with an anchor link to the abstract repository.
for rdf responses, the prov:hasProvenance, prov:hasAnchor, and prov:hasProvenanceService properties should be incorporated into the encoded result. (prov-aq 3.2.1)

for plain get requests, the uri http://dydra.com/_account_/provenance designates the provenance repository.
each revision is present in the account provenance repository, at least in the statement
<revision-iri> prov:hasProvenance <provenance-iri>

Supporting explicit provenance records:
both command-line arguments and request headers can include a provenance record argument.
this passes a string as an init argument for the query.
below, this string is parsed into a graph, which is merged with the generated record.
;;;!!! need to ensure that the header/argument is integrated into the query initarg string.
;;;!!! it should be in the request, as the query is not available for gsp requests.

In the context of an RDF store, the entities are the things in the store. The
essential taxonomy comprises

- named graphs [dumbhill.2003],[carroll.2005] the first indication of their role as the "molecule" +[klyne.2000],[reggiori.2003]
[pediatitis.2009]'s objection neglects first-class versions, which correspond to their graphsets.
also, wrt pediatitis's objection, the provenance of the inferred statement is not the set of graphs per se,
but rather the action which put those graphs together - thus the suitability of revisions/transactions
[halpin] offers a concrete implementation.

in addition, provenance information must also describe essential activities

- tasks (query and update)

finally, in order to integrate application-specific information, the transaction annotation should
carry over an arbitrary graph specified as part of the query operation itself.

add a header to the http query response to locate the respective provenance information for the created revision
add a header to the head response to do the same for the latest revision
add operations (or sub-transaction types) to handle properties specific to creation, deletion, renaming. standard transaction is just read/write.[zhang.2012]
does clone/copy need special handling?
must guarantee stable names for provenance entities in order that any references in the information remain valid outside of the store
do we need lsid resolution in order to allow for store authority variation or should we just use linked-data uri conventions?
wrt their objections, first class revisions allow consistent interpretation under both the coherent and foundational semantics
given this simple schema

@prefix : <http://dydra.com/> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix premis: <http://multimedialab.elis.ugent.be/users/samcoppe/ontologies/Premis/premis.owl#> .

:Transaction rdfs:subClassOf prov:Activity .
:Revision rdfs:subClassOf prov:Entity .
:Graph rdfs:subClassOf prov:Entity .
:Account rdfs:subClassOf prov:Agent .
:Operation rdfs:subClassOf prov:Entity .
:Query rdfs:subClassOf :Operation .
:Repository rdfs:subClassOf prov:Collection .

# describe provenance of entities within the complete dataset stored in the anAccount/aRepository repository

# a minimal description would be

# a repository is a collection of revisions
# of which the latest is the effective revision

<http://dydra.com/anAccount/aRepository> rdf:type :Repository ;

    <http://dydra.com/revision/7fd75320-285e-0130-6d7e-76cb6001b40d> ,
    <http://dydra.com/revision/7fd75321-285e-0130-6d7e-76cb6001b40d> .

<http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d> rdf:type :Transaction .
<http://dydra.com/query/6b7c5700-285e-0130-23b9-76cb6001b40d> rdf:type :Query .
<http://dydra.com/revision/7fd75320-285e-0130-6d7e-76cb6001b40d> rdf:type :Revision .
<http://dydra.com/revision/7fd75321-285e-0130-6d7e-76cb6001b40d> rdf:type :Revision .
<http://dydra.com/anAccount/aRepository?graph=http:%3//example.org/graphZero> rdf:type :Graph .
<http://dydra.com/anAccount/aRepository?graph=http:%3//example.org/graphOne> rdf:type :Graph .
<http://dydra.com/anAccount/aRepository?graph=http:%3//example.org/graphTwo> rdf:type :Graph .
<http://dydra.com/anAccount> rdf:type :Account .

# within the context of the repository (or, the revision?)
# probably the former as many entries span transactions
<http://dydra.com/anAccount/aRepository> {
# the empty repository was created first.
<http://dydra.com/anAccount/aRepository> prov:generatedAtTime '20121231T000000Z'^^xsd:dateTime ;

    # both a simple license
    cc:license <http://creativecommons.org/licenses/by-nc-sa/3.0/de/> ; # for example
    # and premis rights information
    premis:hasObjectRightsStatement :_rightsStatement1 .
:_rightsStatement1 a premis:License;
    premis:identifier <http://some.base.uri/rights/resource/dissemination>;
    premis:licenseInformation <licenseInformation1> ;
    premis:rightsGranted <rightsGranted1> ;
    premis:linkingObject <http://dydra.com/anAccount/aRepository> ;
    premis:linkingContact <http://dydra.com/anAccount>.

<licenseInformation1> a premis:LicenseInformation;
    premis:identifier <http://some.base.uri/license/resource/dissemination>;
    premis:licenseTerms "Text of the license.";
    premis:licenseNote "These objects may be disseminated.".

<rightsGranted1> a premis:LicenseInformation;
    premis:act <license1identifier>;
    premis:termOfGrant <license1termofgrant>.

<license1termofgrant> a premis:TermOfGrant;
    premis:startDate '20121231T000000Z'^^xsd:dateTime.

# describe the transaction
<http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d>
    # sign the transaction
    :signature "bf410f987571aa1932337621fd67eeee030e6465" ;
    # specify the temporal bounds
    prov:startedAtTime '20121231T235958Z'^^xsd:dateTime ;
    prov:endedAtTime '20121231T235959Z'^^xsd:dateTime ;
    # enumerate graphs which contributed to the effective dataset,
    prov:used <http://www.w3.org/People/Berners-Lee/card.rdf> ,
        <http://dydra.com/anAccount/aRepository?graph=http:%3//example.org/aGraph> ;
    # indicate the base revision,
    prov:used <http://dydra.com/revision/7fd75320-285e-0130-6d7e-76cb6001b40d> ;
    # the operations, and
    prov:hadPlan <http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d> ;
    # the agent which executed the transaction.
    prov:wasAssociatedWith <http://dydra.com/anAccount> ;

    # identify the result revision
    prov:generated <http://dydra.com/revision/7fd75321-285e-0130-6d7e-76cb6001b40d> .

<http://dydra.com/query/6b7c5700-285e-0130-23b9-76cb6001b40d>
    prov:hadPlan <http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d> ;
    prov:value 'delete {?s ?p ?o } where { ?s a :statement }' .

# account for the revision in terms of the transaction
<http://dydra.com/revision/7fd75320-285e-0130-6d7e-76cb6001b40d>
    prov:startedAtTime '20111231T235959Z'^^xsd:dateTime ; # would have been asserted in the previous transaction which created the revision
    prov:endedAtTime '20121231T235959Z'^^xsd:dateTime ;
    prov:wasInvalidatedBy <http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d> .
<http://dydra.com/revision/7fd75321-285e-0130-6d7e-76cb6001b40d>
    prov:startedAtTime '20121231T235959Z'^^xsd:dateTime ;
    prov:wasRevisionOf <http://dydra.com/revision/7fd75320-285e-0130-6d7e-76cb6001b40d> ;
    prov:wasGeneratedBy <http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d> .

# describe the effect on graphs modified by the transaction
# this would include graphs deleted as well as graphs added or otherwise modified
# if a constituent named graph is deleted
<http://dydra.com/anAccount/aRepository?graph=http:%3//example.org/graphZero> prov:wasInvalidatedBy <http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d> .
# if a constituent named graph is introduced by a transaction
<http://dydra.com/anAccount/aRepository?graph=http:%3//example.org/graphOne> prov:wasGeneratedBy <http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d> .
# if a constituent named graph is modified by a transaction
<http://dydra.com/anAccount/aRepository?graph=http:%3//example.org/graphTwo> prov:wasInfluencedBy <http://dydra.com/transaction/6b7c5700-285e-0130-23b9-76cb6001b40d> .

how do various ontology models/ontologies and use cases map onto this implementation?

w7 : is overblown. not necessary to distinguish exactly seven hard roles. in the web one needs just a partial order of
combined locations (whats), where just one location need be distinguished (how, the operation) and the other roles are combined
as specified by the operation. what, how, when, where, who, which, why.

what : (sioca:Action) :Transaction: sioca:creates, sioca:modifies: relate to revisions
how : (diff:Diff) :Transaction: diff:removal, diff:addition: relate to statements; sioc:has_creator: relates to :Account; diff:objectOfChange relates to revision;

:Account owl:sameAs w7:Which
:Account owl:sameAs w7:Who

:timestamp owl:sameAs dc:created

:Account owl:sameAs sioc:UserAccount

prov:hadPlan owl:sameAs ?

prov:wasInfluencedBy with attributes for support, detraction, and related entity (w3c mapping notes that diff:reason is not present)

shiyong-2010 suggests optimization strategies:
- join-elimination due to inherent temporal dependency between definition and execution in workflows
- trade off update time for efficient query time as the experiments are long-running and thus the update rate is lower than the query rate
- optimize for frequent queries
- all updates are append-only; there is no deletion and no modification (seems to ignore corrections which bi-temporal phenomena should recognize)
suggests that 'data lineage' is not sufficient
? to what extent is graph-based provenance sufficient to inform the 'PO' based inference rules. This reduces to the question: to what extent
can the annotations of 'PO' apply to just the transaction or the graph, -> intrinsic provenance must permit user-level annotations on either of those two things.
- they argue the only t-box support need be : subClass, transitiveProperty, and symmetric property. this places "dependency graph inference" =? logical
among graphs due to sequence relation?
they use bgp-based inference rules.

event object rights agent

[tan.2007] distinguished data from workflow provenance and, for the former, annotation vs. non-annotation. the mechanism here is annotation-data provenance.
the graph-based annotation combines with the cached query to supply both why and where annotation.

[groth.2012] collects requirements in three dimensions : content model, administration, and use,
which would be a good framework to review an implementation

? to what extent would it support a general provenance-based algebra which aims to compute properties such as "disclosure" and "obfuscation"

coppens : where do they want to put metadata? via OAI-PMH?
sam.coppens@ugent.be ? alternative triple-store as back-end

hartig[hartig.trust] proposes model and algebra additions, but it would be sufficient to permit the query to
bind arbitrary properties to the executed transaction. these properties would then be available in the
bgp-matched solution set for use in filters and aggregate operations. it would not even be necessary to add a function to retrieve them.

in order to limit throughput consequences, provenance data could be written asynchronously to transaction completion.
this would mean it could be incomplete - or eventually complete if a post-crash reconstruction step was included.

hartig, finin, ding, moreau, (other opm/prov authors). mccusker,

hartig's term for this provenance information is "recordable pi", as opposed to "metadata pi", which must be supplied from

[] : http://www.w3.org/TR/sparql11-http-rdf-update/#direct-graph-identification
[] : http://jamesrdf.blogspot.ca/2012/10/provenance-and-traceability-in.html
[] : http://twiki.ipaw.info/bin/view/Challenge/WebHome
[opm] : http://eprints.ecs.soton.ac.uk/21449/
[w7] : http://people.csail.mit.edu/pcm/tempISWC/workshops/SWPM2010/InvitedPaper_7.pdf,
       http://cs5235.userapi.com/u133638729/docs/1a5fafa081eb/Peter_P_Chen_Active_Conceptual_Modeling_of_Lear.pdf#page=25
[premis] : http://multimedialab.elis.ugent.be/users/samcoppe/ontologies/Premis/index.html,
           http://www.loc.gov/standards/premis/
[hartig.trust] : 2009_eswc_hartig_preprint.pdf
[cui.2001] : ftp://db.stanford.edu/pub/dbpubs/2001/56/56.pdf.gz
[ding.2005] : ftp://www.ksl.stanford.edu/local/pub/KSL_Reports/KSL-05-06.pdf
[tan.2007] : http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.109.3078&rep=rep1&type=pdf#page=5
[groth.2012] : http://www.ijdc.net/index.php/ijdc/article/download/203/272
[zhang.2012] : http://www.hpl.hp.com/techreports/2012/HPL-2012-109.pdf
[carroll.2005] : http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.2197&rep=rep1&type=pdf
[klyne.2000] : http://www.ninebynine.org/RDFNotes/RDFContexts.html
[dumbhill.2003] : http://www.ibm.com/developerworks/xml/library/x-rdfprov/index.html
[reggiori.2003] : http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.200.9421&rep=rep1&type=pdf (w/ ref to dumbhill)
[pediatitis.2009] : http://static.usenix.org/event/tapp09/tech/full_papers/pediaditis/pediaditis.pdf

(defgeneric task-provenance-repository (task)
  (:documentation "Given a task, return either its directly specified provenance repository id
or that of the involved repository - which, in turn should delegate to the account.")
  (:method ((task task))
    (or (metadata-provenance-repository-id task)
        (metadata-provenance-repository-id (task-repository task))))
  (:method ((task null))

(defgeneric process-provenance-information (task)
  (:method ((task null))
    ;; a default method, when called outside a task context

  (:method ((query query))
    ;; if the query was an update and it specifies provenance recording, write the provenance information
    (log-debug "process-provenance-information: ~s" query)
    (case *provenance-mode*
      (|urn:dydra|:|internal|
       (unless (operation-read-only-p query)
         (let ((p-repository (task-provenance-repository query))
           ;; allow active suppression with the string value "nil"
           (cond ((not (member p-repository '(nil "NIL" "nil") :test #'equal))
                  (log-debug "provenance repository: ~s" p-repository)
                  (write-provenance-information query p-repository))
                  (log-debug "no provenance repository"))))))
       (log-debug "provenance disabled"))
       (log-warn "Invalid provenance mode: ~s." *provenance-mode*)))))
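
;;; A minimal sketch of the suppression test in the method above, using only
;;; standard Common Lisp. PROVENANCE-REPOSITORY-ENABLED-P is a hypothetical
;;; helper name introduced for illustration; it is not defined in this file.
(defun provenance-repository-enabled-p (designator)
  "Return true unless DESIGNATOR is NIL or one of the strings \"NIL\" or \"nil\"."
  ;; EQUAL compares strings case-sensitively, so a mixed-case value such as
  ;; "Nil" is treated as a repository id rather than as suppression.
  (not (member designator '(nil "NIL" "nil") :test #'equal)))

;;; usage, with an illustrative repository id:
;;; (provenance-repository-enabled-p "jhacker/726-provenance") ; => T
;;; (provenance-repository-enabled-p "nil")                    ; => NIL
;;; (provenance-repository-enabled-p nil)                      ; => NIL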

(defgeneric write-provenance-information (source destination)
  (:documentation "Given a REPOSITORY designator and some abstract or concrete provenance INFORMATION,
re-write the information into a concrete statement sequence which captures the inputs and operations
which constitute the transaction and insert the statements into the given repository's respective
provenance repository in the context of the given repository as graph.")
  (:argument-precedence-order destination source)

  (:method ((source t) (repository-designator string))
      (with-open-repository (repository-designator :normal-disposition :commit
                                                   :id (make-provenance-task-id)
        (write-provenance-information source *repository*)
        (commit-transaction *transaction*))
    (spocq.e:repository-not-found-error (c)
      (error "Invalid provenance storage: ~a, ~a." repository-designator c))))

  (:method ((source null) (destination t))

  (:method ((source t) (destination null))

  (:method ((query query) (repository repository))
    ;; to this point, access is already authorized. now just restrict further
    ;; to be in the same account
    (assert (equal (account-name (repository-account (task-repository query)))
                   (account-name (repository-account repository)))
            "Invalid provenance storage: must be identical account: ~a, ~a."
            (repository-id repository) (repository-id (task-repository query)))
    (assert (not (equal (task-repository query) repository)) ()
            "Invalid provenance storage: Reflexive is not permitted: ~a." (repository-id repository))
    (let ((info (compute-provenance-information query)))
      (write-provenance-information info repository)
      (log-info "task ~s. provenance recorded: ~s statements for ~s"
                (task-id query) (length info) (repository-id repository))))

  (:method ((information cons) (repository repository))
    (repository-insert-field repository (repository-intern-statements repository information))

  (:method ((information cons) (stream stream))
    (write-rdf-nquads information stream)))

(defgeneric compute-provenance-information (context)
  (:documentation "Return the provenance information for the given query as
a statement list. This comprises several triples to be placed in the default graph
and a transaction-uri context with statements specific to the transaction.")

  (:method ((query query))
    (log-debug "compute-provenance-information: ~s" query)
    (let* ((repository (task-repository query))
           (repository-uri (repository-identifier repository))
           (repository-account-uri (account-identifier repository))
           (transaction (task-transaction query))
           (transaction-uri (transaction-uri transaction))
           (revision-uri (revision-uri transaction))
           (revision-signature (revision-signature transaction))
           (query-uuid (task-uuid query))
           (request-provenance-record (query-provenance-record query))
           (parent-revision-uri (revision-parent-uri transaction))
           #+(or) ; the declared graphs do not signify as
                  ; a, they may or may not have been incorporated
                  ; b, a wildcard graph could introduce others
           (graphs (loop for graph in (destructuring-bind (default named) (query-graphs query)
                                        (union default named :test #'equalp))
                         collect (graph-indirect-uri repository graph)))
           (created-graphs (loop for graph in (list-transaction-created-graphs transaction)
                                 collect (graph-indirect-uri repository graph)))
           (deleted-graphs (loop for graph in (list-transaction-deleted-graphs transaction)
                                 collect (graph-indirect-uri repository graph)))
           (modified-graphs (loop for graph in (list-transaction-modified-graphs transaction)
                                  collect (graph-indirect-uri repository graph)))
           (read-graphs (loop for graph in (list-transaction-read-graphs transaction)
                              collect (graph-indirect-uri repository graph)))
           (query-account (task-account query))
           (query-account-uri (when query-account (account-identifier query-account)))
           (license (query-license query)) ;; eg: <http://creativecommons.org/licenses/by-nc-sa/3.0/de/>
           (start-timestamp (universal-time-date-time (transaction-start-time transaction)))
           (end-timestamp (let ((end-time (transaction-end-time transaction)))
                            (when end-time (universal-time-date-time end-time))))
           ;; use query id as blank node prefix to make them unique
           (licence-node (intern-blank-node (format nil "p-~a-1" (task-id query))))
           (licence-information-node (intern-blank-node (format nil "p-~a-2" (task-id query))))
           (grant-node (intern-blank-node (format nil "p-~a-3" (task-id query))))
           (signature (query-signature query))
           (user-id (task-user-id query))
           (agent (task-agent query))
           (query-agent-uri (when agent (agent-identifier agent))))

      `(;; global - account-wide information
        ;; assert the repository and the account as such
        (spocq.a:|triple| ,repository-uri |rdf|:|type| <urn:dydra:Repository>)
        (spocq.a:|triple| ,repository-account-uri |rdf|:|type| <urn:dydra:Account>)
        (spocq.a:|triple| ,repository-uri |prov|:|wasAssociatedWith| ,repository-account-uri)
        ;; assert the specific account associated with the transaction, if specified
        ,@(when query-account-uri
            `((spocq.a:|triple| ,query-account-uri |rdf|:|type| <urn:dydra:Account>)))
        ;; note entities - transaction, query, and revision in the repository context
        (spocq.a:|graph| ,repository-uri
         ((spocq.a:|triple| ,transaction-uri |rdf|:|type| <urn:dydra:Transaction>)
          (spocq.a:|triple| ,transaction-uri |rdf|:|type| |prov|:|Activity|)
          (spocq.a:|triple| ,revision-uri |rdf|:|type| <urn:dydra:Revision>)
          (spocq.a:|triple| ,query-uuid |rdf|:|type| <urn:dydra:Query>)
        ;; everything else is in the context of the transaction
        (spocq.a:|graph| ,transaction-uri
         ((spocq.a:|triple| ,transaction-uri |prov|:|generated| ,revision-uri)
          (spocq.a:|triple| ,transaction-uri |prov|:|hadPlan| ,query-uuid)
          (spocq.a:|triple| ,repository-uri |prov|:|hadMember| ,revision-uri)
          ;; when this is the first revision, record repository creation
          ,@(unless parent-revision-uri
              `((spocq.a:|triple| ,repository-uri <http://www.w3.org/ns/prov#generatedAtTime> ,start-timestamp)))
          ;; ensure that graphs are known respective the repository
          ,@(loop for graph in (remove-duplicates (append created-graphs deleted-graphs modified-graphs read-graphs))
                  collect `(spocq.a:|triple| ,graph |rdf|:|type| <urn:dydra:Graph>))
          ;; include transaction-specific rights
              `((spocq.a:|triple| ,repository-uri <http://creativecommons.org/ns#license> ,license)
                (spocq.a:|triple| ,repository-uri |premis|:|hasObjectRightsStatement| ,licence-node)
                (spocq.a:|triple| ,licence-node |rdf|:|type| |premis|:|License|)
                (spocq.a:|triple| ,licence-node |premis|:|identifier| ,license)
                (spocq.a:|triple| ,licence-node |premis|:|rightsGranted| ,licence-information-node)
                (spocq.a:|triple| ,licence-information-node |rdf|:|type| |premis|:|LicenseInformation|)
                (spocq.a:|triple| ,licence-information-node |premis|:|termOfGrant| ,grant-node)
                (spocq.a:|triple| ,grant-node |rdf|:|type| |premis|:|TermOfGrant|)
                (spocq.a:|triple| ,grant-node |premis|:|startDate| ,start-timestamp))))
          ;; describe the transaction
          ,@(when revision-signature
              `((spocq.a:|triple| ,revision-uri <urn:dydra:signature> ,revision-signature)))
          (spocq.a:|triple| ,transaction-uri |prov|:|startedAtTime| ,start-timestamp)
          ;; if a transaction was aborted, then there is no end-timestamp
          ,@(when end-timestamp
              `((spocq.a:|triple| ,transaction-uri |prov|:|endedAtTime| ,end-timestamp)))
          ,@(when parent-revision-uri
              `((spocq.a:|triple| ,transaction-uri |prov|:|used| ,parent-revision-uri)
                (spocq.a:|triple| ,parent-revision-uri |prov|:|wasUsedBy| ,transaction-uri)
                (spocq.a:|triple| ,revision-uri |prov|:|wasRevisionOf| ,parent-revision-uri)
          ,@(when query-account-uri
              `((spocq.a:|triple| ,transaction-uri |prov|:|wasAssociatedWith| ,query-account-uri)))
          ;; (spocq.a:|triple| ,query-uuid |prov|:|value| ,(query-sparql-expression query))
              `((spocq.a:|triple| ,query-uuid <urn:dydra:signature> ,signature)))
              `((spocq.a:|triple| ,query-uuid <urn:dydra:user_id> ,user-id)))
          ,@(when query-agent-uri
              `((spocq.a:|triple| ,query-uuid |prov|:|wasAttributedTo| ,query-agent-uri)))
          (spocq.a:|triple| ,revision-uri |prov|:|startedAtTime| ,start-timestamp)
          (spocq.a:|triple| ,revision-uri |prov|:|wasGeneratedBy| ,transaction-uri)

          ;; track changes to graphs
          ;; rather than the generic `(spocq.a:|triple| ,transaction-uri |prov|:|used| ,graph), this
          ;; differentiates the association
          ,@(loop for graph in created-graphs
                  collect `(spocq.a:|triple| ,graph |prov|:|wasGeneratedBy| ,transaction-uri))
          ,@(loop for graph in deleted-graphs
                  collect `(spocq.a:|triple| ,graph |prov|:|wasInvalidatedBy| ,transaction-uri))
          ,@(loop for graph in modified-graphs
                  collect `(spocq.a:|triple| ,graph |prov|:|wasInfluencedBy| ,transaction-uri))
          ,@(loop for graph in read-graphs
                  collect `(spocq.a:|triple| ,revision-uri |prov|:|wasDerivedFrom| ,graph))))
        ,@(when parent-revision-uri
            ;; as the revision is not a prov:Activity, prov:invalidatedAtTime applies
            ;; rather than prov:endedAtTime
            `((spocq.a:|graph| ,parent-revision-uri
               ((spocq.a:|triple| ,parent-revision-uri |prov|:|wasInvalidatedBy| ,transaction-uri)
                (spocq.a:|triple| ,parent-revision-uri |prov|:|invalidatedAtTime| ,end-timestamp)))))
        ;; parse is delayed to this point in order for it to happen in the context of any
        ;; query metadata - base and prefixes
        ,@(when request-provenance-record
            (loop for form in (parse-quads request-provenance-record)
                  do (cond ((triple-form-p form)
                           ((and (listp form) (every #'triple-form-p form))
                            (setf triples (append form triples)))
                            (log-warn "compute-provenance-information: anomalous request record: ~s" form)))
                  finally (when triples
                            (return `((spocq.a:|graph| ,transaction-uri ,@triples)))))))))

  (:method ((repository repository))
    "compute a provenance record for a recently modified repository based on the information available
from examining the most recent transaction."
    (compute-repository-provenance-information repository)))
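
;;; A minimal sketch of how the parsed request record is flattened in the query
;;; method above, using only standard Common Lisp. SKETCH-TRIPLE-FORM-P and
;;; COLLECT-TRIPLE-FORMS are hypothetical names, the :TRIPLE marker stands in
;;; for the real statement operator, and the handling of a single triple form
;;; is an assumption rather than a copy of the original branch.
(defun sketch-triple-form-p (form)
  "Return true if FORM is a list headed by the keyword :TRIPLE."
  (and (consp form) (eq (first form) :triple)))

(defun collect-triple-forms (forms)
  "Return two values: the triple forms found in FORMS and the anomalous forms."
  (let ((triples '())
        (anomalies '()))
    (dolist (form forms)
      (cond ((sketch-triple-form-p form)
             ;; a single statement at top level
             (push form triples))
            ((and (listp form) (every #'sketch-triple-form-p form))
             ;; a list of statements, prepended as a group
             (setf triples (append form triples)))
            (t
             ;; anything else is set aside rather than merged
             (push form anomalies))))
    (values triples anomalies)))

;;; usage:
;;; (collect-triple-forms '((:triple a b c) ((:triple d e f)) "junk"))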

(defgeneric compute-repository-provenance-information (repository &key graph operation user-id)
  (:method ((repository-id string) &rest args)
    (apply #'compute-repository-provenance-information (repository repository-id) args))

  (:method ((repository repository) &key graph (operation :PUT) user-id
            (revision-class (repository-revision-class repository)))
    (log-debug "compute-provenance-information: ~s" repository)
    (flet ((ensure-revision (revision-id)
             (or (get-registry revision-id *repositories*)
                 (setf (get-registry revision-id *repositories*)
                       (make-instance revision-class :revision-id revision-id :reference repository)))))
      (let* ((repository-id (repository-id repository))
             (repository-uri (repository-identifier repository))
             (repository-account-uri (account-identifier repository))
             (revision (repository-revision "HEAD" :reference repository-id))
             (parent-revision (repository-revision "HEAD~1" :reference repository-id))
             (revision-id (repository-revision-id revision))
             (revision-uri (repository-revision-uri revision))
             ;; not used (parent-revision-id (when parent-revision (repository-revision-id parent-revision)))
             (parent-revision-uri (when parent-revision (revision-uri parent-revision)))
             (transaction-uri (compute-transaction-uri revision-id))
             (revision-signature (revision-signature revision)) ; ?
             (graphs (when graph (list (graph-indirect-uri repository graph))))
             (created-graphs (case operation
             (deleted-graphs (case operation
             (modified-graphs (case operation
                                ((:patch :post) graphs)))
             ;; the start of the revision is the end of the transaction
             (end-timestamp (repository-revision-start-date-time revision)))

        `(;; global - account-wide information
          ;; assert the repository and the account as such
          (spocq.a:|triple| ,repository-uri |rdf|:|type| <urn:dydra:Repository>)
          (spocq.a:|triple| ,repository-account-uri |rdf|:|type| <urn:dydra:Account>)
          (spocq.a:|triple| ,repository-uri |prov|:|wasAssociatedWith| ,repository-account-uri)
          ;; note entities - transaction and revision in the repository context
          (spocq.a:|graph| ,repository-uri
           ((spocq.a:|triple| ,transaction-uri |rdf|:|type| <urn:dydra:Transaction>)
            (spocq.a:|triple| ,transaction-uri |rdf|:|type| |prov|:|Activity|)
            (spocq.a:|triple| ,revision-uri |rdf|:|type| <urn:dydra:Revision>)
          ;; everything else is in the context of the transaction
          (spocq.a:|graph| ,transaction-uri
           ((spocq.a:|triple| ,transaction-uri |prov|:|generated| ,revision-uri)
            (spocq.a:|triple| ,repository-uri |prov|:|hadMember| ,revision-uri)
            ;; when this is the first revision, record repository creation
            ,@(unless parent-revision-uri
                `((spocq.a:|triple| ,repository-uri <http://www.w3.org/ns/prov#generatedAtTime> ,end-timestamp)))
            ;; ensure that graphs are known respective the repository
            ,@(loop for graph in (remove-duplicates (append created-graphs deleted-graphs modified-graphs read-graphs))
                    collect `(spocq.a:|triple| ,graph |rdf|:|type| <urn:dydra:Graph>))
            ;; describe the transaction
            ,@(when revision-signature
                `((spocq.a:|triple| ,revision-uri <urn:dydra:signature> ,revision-signature)))
            (spocq.a:|triple| ,transaction-uri |prov|:|startedAtTime| ,end-timestamp)
            ;; given the revision, the transaction was complete, but it is not possible to
            ;; distinguish the start from the end time
            (spocq.a:|triple| ,transaction-uri |prov|:|endedAtTime| ,end-timestamp)
            ,@(when parent-revision-uri
                `((spocq.a:|triple| ,transaction-uri |prov|:|used| ,parent-revision-uri)
                  (spocq.a:|triple| ,parent-revision-uri |prov|:|wasUsedBy| ,transaction-uri)
                  (spocq.a:|triple| ,revision-uri |prov|:|wasRevisionOf| ,parent-revision-uri)
                `((spocq.a:|triple| ,transaction-uri <urn:dydra:user_id> ,user-id)))
            (spocq.a:|triple| ,revision-uri |prov|:|startedAtTime| ,end-timestamp)
            (spocq.a:|triple| ,revision-uri |prov|:|wasGeneratedBy| ,transaction-uri)

            ;; track changes to graphs
            ;; rather than the generic `(spocq.a:|triple| ,transaction-uri |prov|:|used| ,graph), this
            ;; differentiates the association
            ,@(loop for graph in created-graphs
                    collect `(spocq.a:|triple| ,graph |prov|:|wasGeneratedBy| ,transaction-uri))
            ,@(loop for graph in deleted-graphs
                    collect `(spocq.a:|triple| ,graph |prov|:|wasInvalidatedBy| ,transaction-uri))
            ,@(loop for graph in modified-graphs
                    collect `(spocq.a:|triple| ,graph |prov|:|wasInfluencedBy| ,transaction-uri))
            ,@(loop for graph in read-graphs
                    collect `(spocq.a:|triple| ,revision-uri |prov|:|wasDerivedFrom| ,graph))))
          ,@(when parent-revision-uri
              ;; as the revision is not a prov:Activity, prov:invalidatedAtTime applies
              ;; rather than prov:endedAtTime
              `((spocq.a:|graph| ,parent-revision-uri
                 ((spocq.a:|triple| ,parent-revision-uri |prov|:|wasInvalidatedBy| ,transaction-uri)
                  (spocq.a:|triple| ,parent-revision-uri |prov|:|invalidatedAtTime| ,end-timestamp))))))))))
;;; (pprint-sse (compute-repository-provenance-information "openrdf-sesame/mem-rdf"))
;;; (pprint-sse (compute-repository-provenance-information "openrdf-sesame/mem-rdf" :graph <http://example.org> :operation :delete))
;;; (pprint-sse (compute-repository-provenance-information "openrdf-sesame/mem-rdf" :graph <http://example.org> :operation :patch))
;;; (pprint-sse (compute-repository-provenance-information "openrdf-sesame/mem-rdf" :graph <http://example.org> :operation :post))
;;; (pprint-sse (compute-repository-provenance-information "openrdf-sesame/mem-rdf" :graph <http://example.org> :operation :put))

/opt/dydra/bin/sbcl --core /opt/dydra/lib/spocq/sbcl-spocq.core --spocqinit /opt/dydra/lib/spocq/init.sxp --content-type application/sparql-query --accept application/sparql-results+term-number
(in-package :spocq.i)
(trace write-provenance-information task-provenance-repository process-provenance-information respond-to-task pipe-query query-run-in-thread send-query-response
       RDFCACHE:COMMIT-TRANSACTION)
(spocq.i::run-query-loop-once)
((:task-id "0a780aed-1620-4134-a328-111111111134") (:repository-id "6/1648") (:query-signature "fc716949cf4f00620c998b0f9c3e711a864eeff2"))

PREFIX provenanceRepositoryID: <jhacker/726-provenance>
INSERT DATA { <http://example.org/uri1/one> <foaf:name> 'object-24646' .
<http://example.org/uri1/one> rdf:type <http://example.org/thing> .