Hacking handles into Logseq PDF citations

Background

Logseq has relatively best-in-class PDF annotation support. My biggest problem was that all PDF citations only show the page number of the reference. (I don't know that Logseq has a name for the annotation link. Here, I call it a "citation".) If you refer to multiple documents, you need to click through to tell which one refers to which document!

Before

Logseq screenshot showing PDF references composed only of the letter P and the page number of the reference

Each yellow circle indicates a PDF citation. The number is the page number of the PDF (a good start!) but which PDF?

After

Logseq screenshot showing PDF references composed of a handle like "HtN" and "GtN", the letter P, and the page number of the reference

With a handle like “HtN” and “GtN”, I can tell which citation refers to which document.

Asking the Community

I posted in the Logseq Discord:

Hiya folks! I’m looking to extend the (colored dot) P annotation handle thing. I really want to pull in a pdf name—maybe defined as a property on the annotation page? i.e. instead of * P240 “hello world” it may say something like * Logseq for dummies P240 “hello world”. Is this possible? Anyone have any tips or pointers? I am a pretty competent programmer but haven’t done Logseq plugin-y work before.

Someone named Charlie jumped in, and said:

Cool! That’s a great idea! The current plugin SDK doesn’t provide an API to intercept the rendering of PDF annotation blocks. Even if it is possible to achieve through hacking, it may be complicated and have performance issues. We will consider implementing this feature natively or providing a specific plugin API as soon as possible.

Phrased like a seasoned open source developer!

Unfortunately, I have to quickly pick the underlying tool for an annotation project. There are no clear winners, but Logseq is at the top. This is one of the main blockers—if I can hack this feature in, even in a hard-to-maintain and performance-impacting way, it may be enough to make me feel comfortable using Logseq for the project. Hopefully, by the time I’m done with the project, Logseq won’t have this problem anymore.

Hacking

I haven’t worked with Logseq’s internals before, and I don’t think I’ve worked with Clojure. I’ve done some work in Lisp-y environments, though, so I opened up the source. Within a few minutes, I was able to find where the citations get added (in block.cljs). I could tell that the section didn’t directly have the document or document name, but it did have the document ID. Time for hacking!

I set up the development environment.

This was not done with any real knowledge of Clojure or Logseq internals, unfortunately, so please don’t take this as a great example of how to do Clojure work! I saw reference to a REPL, but the Electron change loop was pretty fast, and I wasn’t sure I had the Clojure knowledge to translate things from the REPL into a plugin or patch to the source.

While looking for examples of getting properties through a database lookup, I found some code in Logseq that was using debug/pprint. The advantage of using pprint over println appeared to be that pprint would indent the output to show the structure of the data better.
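The difference is easy to see outside Logseq. The following is a minimal sketch that uses the standard clojure.pprint namespace as a stand-in for Logseq's debug/pprint helper, which is assumed (not confirmed) to behave similarly. The sample-block map and every value in it are hypothetical.

```clojure
(require '[clojure.pprint :as pp])

;; Hypothetical data, loosely shaped like a block with a nested page entity.
(def sample-block
  {:block/uuid "hypothetical-uuid-0001"
   :block/content "An annotated passage from the PDF"
   :block/page {:db/id 42
                :block/name "hypothetical-annotation-page"
                :block/properties {:hl-handle "GtN"
                                   :file-path "../assets/hypothetical.pdf"}}})

;; println writes the whole map on one long line.
(println sample-block)

;; pprint wraps at a right margin and indents nested maps,
;; so the structure is easier to scan.
(pp/pprint sample-block)
```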

I incrementally found the reference to the PDF document, got a property of that page through a database lookup, and then rendered it in the citation.

(debug/pprint "adamwolf: t" t)
(debug/pprint "adamwolf: (:block/page t)" (:block/page t))
(debug/pprint "adamwolf xyzzy: (db-utils/pull (:db/id (:block/page t)))"
              (db-utils/pull (:db/id (:block/page t))))
;; Drill into the pulled page: take its :block/properties, then the :hl-handle value.
(debug/pprint "adamwolf: (:hl-handle (:block/properties (db-utils/pull ...)))"
              (:hl-handle
                (:block/properties
                  (db-utils/pull (:db/id (:block/page t))))))

I added some comments to explain my intent. You can see the change in context.

;; Get the ID of the page referred to by t's :block/page.
;;
;; Use that ID to pull the page from the database, and only take the
;; :block/properties map.
;;
;; Look up the :hl-handle in the :block/properties, and bind it
;; to awolf-hl-handle.
;;
;; If any of those things are nil or don't exist, awolf-hl-handle is nil.
;;
;; Then, if awolf-hl-handle is not nil, render a span.hl-handle with
;; the handle and a space in it.

(let [awolf-hl-handle (:hl-handle
                        (:block/properties
                          (db-utils/pull [:block/properties]
                                         (:db/id (:block/page t)))))]
  (when awolf-hl-handle
    [:span.hl-handle
     [:strong.forbid-edit (str awolf-hl-handle " ")]]))

Now, if I add a property of hl-handle:: foo to a document page, all the citations for that PDF are formatted like foo P103! Great!
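The nested keyword lookups can also be written with the some-> threading macro, which stops at the first nil step. The sketch below runs outside Logseq: fake-db and fake-pull are hypothetical stand-ins for the real database and for db-utils/pull, and the function names hl-handle and handle-span are made up for illustration. It is not the code in the patch, only the same idea in a form that can be tried at a plain Clojure REPL.

```clojure
;; Hypothetical in-memory stand-in for the Logseq database, keyed by :db/id.
(def fake-db
  {42 {:block/properties {:hl-handle "GtN"}}
   43 {:block/properties {}}})

;; Hypothetical replacement for db-utils/pull.
(defn fake-pull [id]
  (get fake-db id))

;; Walk from the block to its page, then to the page's :hl-handle property.
;; some-> returns nil as soon as any step yields nil.
(defn hl-handle [t]
  (some-> t :block/page :db/id fake-pull :block/properties :hl-handle))

;; Build the hiccup span only when a handle exists.
(defn handle-span [t]
  (when-let [handle (hl-handle t)]
    [:span.hl-handle
     [:strong.forbid-edit (str handle " ")]]))

(handle-span {:block/page {:db/id 42}})
;; => [:span.hl-handle [:strong.forbid-edit "GtN "]]
(handle-span {:block/page {:db/id 43}})
;; => nil
(handle-span {})
;; => nil
```

One difference from the nested form: when the block has no page, some-> never calls the lookup function at all, so the function does not need to cope with a nil ID.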

Can I use this?

This is definitely a good proof of concept. Is it something I would want to use locally for a bit?

Wrapping up

To close the loop, I posted an update in the Discord. I wanted to make sure that if someone came across my request in the future, they wouldn’t find an “Oh yeah, I got it working!” message with no details. I also wanted to make sure I didn’t come across as expecting my quick-and-dirty change to be adopted across the whole project.

I posted the following in the Discord thread.

OK, so for posterity’s sake, lemme post what I did. I added the following to block.cljs.

;; Get the ID of the page referred to by t's :block/page.
;; Use that ID to pull the page from the database, and only take the :block/properties map.
;; Then, look up the :hl-handle in the :block/properties, and bind it to awolf-hl-handle.
;; If any of those things are nil or don't exist, awolf-hl-handle is nil.

;; Then, if awolf-hl-handle is not nil, render a span.hl-handle with the handle and a space in it.
(let [awolf-hl-handle (:hl-handle (:block/properties (db-utils/pull [:block/properties] (:db/id (:block/page t)))))]
	(when awolf-hl-handle
		[:span.hl-handle
			[:strong.forbid-edit (str awolf-hl-handle " ")]]
		)
	)

This adds a db lookup for every single block render of an annotation, maybe for every block. The code is probably an abomination. I haven’t ever used Clojure before, and I don’t really know how logseq works in any way. This is a little personal proof-of-concept patch.

Anyway, after that, you can add an hl-handle:: foo property to the annotation page, and then those annotated things will show up with the handle prefix before the page number.

I hope developers add a PDF citation plugin hook. I’d like to see this feature without maintaining a fork, even as small as it is. I like the document handle, but there’s room for improvement. For instance, supporting custom citation formats! I’d like to see a chapter number, actually, even if only in exports. Zotero, for instance, supports thousands of citation formats. There's even a Citation Style Language supported by multiple tools! Giving an export to someone without the PDF becomes more useful when a citation reads “Gideon the Ninth, ch. 32, p. 409” rather than “P409”.
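As a rough illustration of what a richer citation string could look like, here is a small sketch. It is not Citation Style Language and not anything Logseq provides. The format-citation function and its :title, :chapter, and :page keys are hypothetical.

```clojure
(require '[clojure.string :as str])

;; Hypothetical formatter: joins whichever parts are present with ", ".
(defn format-citation
  [{:keys [title chapter page]}]
  (str/join ", "
            (remove nil?
                    [title
                     (when chapter (str "ch. " chapter))
                     (when page (str "p. " page))])))

(format-citation {:title "Gideon the Ninth" :chapter 32 :page 409})
;; => "Gideon the Ninth, ch. 32, p. 409"
(format-citation {:page 409})
;; => "p. 409"
```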
