The lesson learned from refactoring rspotify
1 Preface
Recently, Mario and I have been refactoring rspotify, trying to improve its performance, documentation, error handling, and data model, and to reduce its compile time, all to make it easier to use. (For those who have never heard of rspotify, it is a Spotify HTTP SDK implemented in Rust.)
I am partly focusing on polishing the data model, based on the issue created by Koxiaet.
Since rspotify is an API client for Spotify, it has to handle requests to and responses from the Spotify HTTP API.
Generally speaking, the data model is about how to structure the response data; it uses Serde to parse the JSON responses from the HTTP API into Rust structs. I have learnt a lot of Serde tricks from this refactoring.
2 Serde Lesson
2.1 Deserialize JSON map to Vec based on its value.
An actions object, which contains a disallows object, allows the user interface to be updated based on which playback actions are available within the current context.
The response JSON data from HTTP API:
|
|
The original model representing actions was:
|
|
And Koxiaet gave great advice about how to polish Actions:
Actions::disallows can be replaced with a Vec<DisallowKey> or HashSet<DisallowKey> by removing all entries whose value is false, which will result in a simpler API.
To be honest, I was not that familiar with Serde before. After digging into its official documentation for a while, it seems there is no built-in way to convert a JSON map to a Vec<T> based on the map's values.
After reading the Custom serialization section of the documentation, a simple solution came to mind, so I wrote my first custom deserialize function.
I created a dumb Actions struct inside the deserialize function, and converted the HashMap to a Vec by filtering on its values.
|
|
The types should be familiar if you've used Serde before.
If you're not used to Rust then the function signature will likely look a little strange. What it's trying to say is that d will be something that implements Serde's Deserializer trait, and that any references to memory will live for the 'de lifetime.
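Setting the Serde wiring aside, the filtering step itself can be sketched with the standard library alone. In the sketch below, the three DisallowKey variants, the parse_key helper, and the disallowed_keys function are illustrative names I made up for this example, not rspotify's actual code. The result is sorted only to make it deterministic, since HashMap iteration order is not.

```rust
use std::collections::HashMap;

/// Illustrative subset of keys; the real enum would list every action.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum DisallowKey {
    Pausing,
    Resuming,
    Seeking,
}

/// Hypothetical helper: maps a JSON key to its enum variant.
fn parse_key(key: &str) -> Option<DisallowKey> {
    match key {
        "pausing" => Some(DisallowKey::Pausing),
        "resuming" => Some(DisallowKey::Resuming),
        "seeking" => Some(DisallowKey::Seeking),
        _ => None,
    }
}

/// Keeps only the keys whose value is `true` and drops everything else.
pub fn disallowed_keys(raw: &HashMap<String, bool>) -> Vec<DisallowKey> {
    let mut keys: Vec<DisallowKey> = raw
        .iter()
        .filter(|(_, disallowed)| **disallowed)
        .filter_map(|(key, _)| parse_key(key))
        .collect();
    // Sort so the output does not depend on HashMap iteration order.
    keys.sort();
    keys
}
```

Inside a custom deserialize function, the same filter would run on the HashMap<String, bool> that the dumb inner struct was deserialized into.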
2.2 Deserialize Unix milliseconds timestamp to Datetime
A currently playing object contains information about the currently playing item, and its timestamp field is an integer representing the Unix millisecond timestamp at which the data was fetched.
The response JSON data from HTTP API:
|
|
The original model was:
|
|
As before, Koxiaet made a great point about timestamp and progress_ms (I will talk about it later):
CurrentlyPlayingContext::timestamp should be a chrono::DateTime<Utc>, which could be easier to use.
The polished struct looks like:
|
|
Using the deserialize_with attribute tells Serde to use custom deserialization code for the timestamp field. The from_millisecond_timestamp code is:
|
|
The code calls d.deserialize_u64, passing in a struct. The passed-in struct implements Serde's Visitor, and looks like:
|
|
The struct DateTimeVisitor doesn't have any fields; it is just a type that implements the custom visitor, which takes care of parsing the u64.
Since there was no way to construct a DateTime directly from a Unix millisecond timestamp, I had to figure out how to handle the construction. It turns out that there is a way to construct a DateTime from seconds and nanoseconds:
|
|
Thus, all I need to do is convert the milliseconds into seconds and nanoseconds:
|
|
The to_millisecond_timestamp function is similar to from_millisecond_timestamp, but it's easier to implement; check this PR for more detail.
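The arithmetic behind both directions is small enough to sketch without chrono or Serde. The function names split_millis and join_millis are my own for this example, and the (i64, u32) shape is an assumption chosen to match what a seconds-plus-nanoseconds constructor typically takes.

```rust
/// Splits a Unix timestamp in milliseconds into whole seconds and the
/// leftover part expressed in nanoseconds.
pub fn split_millis(millis: u64) -> (i64, u32) {
    let secs = (millis / 1000) as i64;
    // The remainder is below 1000, so the product always fits in a u32.
    let nanos = ((millis % 1000) * 1_000_000) as u32;
    (secs, nanos)
}

/// The reverse direction: seconds plus nanoseconds back to milliseconds.
/// Sub-millisecond precision is truncated.
pub fn join_millis(secs: i64, nanos: u32) -> i64 {
    secs * 1000 + i64::from(nanos / 1_000_000)
}
```

The visitor would pass the pair from split_millis to the DateTime constructor, and the serialize side would do the equivalent of join_millis before writing the integer out.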
2.3 Deserialize milliseconds to Duration
The simplified episode object contains the simplified episode information, and its duration_ms field is an integer representing the episode length in milliseconds.
The response JSON data from HTTP API:
|
|
The original model was:
|
|
As before, Koxiaet pointed out that
SimplifiedEpisode::duration_ms should be replaced with a duration of type Duration, since a built-in Duration type works better than a primitive type.
Since I had already worked with Serde's custom deserialization, this was no longer a hard job for me. I quickly figured out how to deserialize a u64 into a Duration:
|
|
Now life is easier than before.
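The conversion at the heart of that visitor can be sketched with the standard library only, assuming that Duration here means std::time::Duration. The names duration_from_ms and duration_to_ms are hypothetical and only stand in for the logic inside the Serde functions.

```rust
use std::time::Duration;

/// What the visitor does once it has been handed the u64.
pub fn duration_from_ms(ms: u64) -> Duration {
    Duration::from_millis(ms)
}

/// The serialize direction. `as_millis` returns a u128, so the cast
/// truncates for durations too long to fit in a u64 of milliseconds.
pub fn duration_to_ms(duration: Duration) -> u64 {
    duration.as_millis() as u64
}
```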
2.4 Deserialize milliseconds to Option
Let's go back to the CurrentlyPlayingContext model. Since we have replaced milliseconds (represented as u32) with Duration, it makes sense to replace all millisecond fields with Duration.
But hold on, it seems the progress_ms field is a bit different.
The progress_ms field is either absent or a number of milliseconds. A u32 handles the milliseconds, but since the value might not be present in the response, the field is an Option<u32>, so it won't work with from_duration_ms.
Thus, it's necessary to figure out how to handle the Option type, and the answer is in the documentation, the deserialize_option function:
Hint that the Deserialize type is expecting an optional value.
This allows deserializers that encode an optional value as a nullable value to convert the null value into None and a regular value into Some(value).
|
|
As before, the OptionDurationVisitor is an empty struct that implements the Visitor trait, but the key point is that, in order to work with deserialize_option, the OptionDurationVisitor has to implement the visit_none and visit_some methods:
|
|
The visit_none method returns Ok(None), so the progress value in the struct will be None, and visit_some delegates the parsing logic to DurationVisitor via the deserialize_u64 call, so deserializing Some(u64) works just like deserializing a u64.
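Stripped of the Serde machinery, the two visitor methods boil down to the two arms of a match. This is only a sketch of that logic under the assumption that Duration is std::time::Duration; option_duration_from_ms is a made-up name, and the real code goes through the Visitor trait instead.

```rust
use std::time::Duration;

pub fn option_duration_from_ms(ms: Option<u64>) -> Option<Duration> {
    match ms {
        // Mirrors visit_none: an absent or null value stays None.
        None => None,
        // Mirrors visit_some: reuse the plain milliseconds conversion.
        Some(ms) => Some(Duration::from_millis(ms)),
    }
}
```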
2.5 Deserialize enum from number
An AudioAnalysisSection model contains a mode field, which indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. This field will contain a 0 for minor, a 1 for major, or a -1 for no result.
The response JSON data from HTTP API:
|
|
The original struct representing AudioAnalysisSection was like this, with the mode field stored as an f32 (a small integer type such as i8 would have been a better choice for this case):
|
|
Koxiaet made a great point about the mode field:
AudioAnalysisSection::mode and AudioFeatures::mode are f32s but should be Option<Mode>s where enum Mode { Major, Minor } as it is more useful.
In this case, we don't need the Option type, and in order to deserialize an enum from a number, we first need to define a C-like enum:
|
|
And then, what's the next step? It seems Serde doesn't natively allow C-like enums to be represented as integers rather than strings in JSON:
|
|
The failed version is exactly what we want. I know that Serde's official documentation has a solution for this case: the serde_repr crate provides alternative derive macros that derive the same Serialize and Deserialize traits but delegate to the underlying representation of a C-like enum.
Since we are trying to reduce the compile time of rspotify, we are cautious about introducing new dependencies. So a custom-made deserialize function is a better choice; it just needs to match the number and convert it to the corresponding enum value.
|
|
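The matching step can be sketched without Serde or serde_repr. Here the NoResult variant, the explicit discriminants, and the function names mode_from_number and mode_to_number are my assumptions for illustration; the input is taken as an f64 because the field used to be stored as a float.

```rust
/// C-like enum whose discriminants follow the values the API sends.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Minor = 0,
    Major = 1,
    NoResult = -1,
}

/// Matches the number from the response against the known values.
pub fn mode_from_number(n: f64) -> Result<Mode, String> {
    if n == 0.0 {
        Ok(Mode::Minor)
    } else if n == 1.0 {
        Ok(Mode::Major)
    } else if n == -1.0 {
        Ok(Mode::NoResult)
    } else {
        // Anything else is rejected instead of being silently mapped.
        Err(format!("invalid mode: {}", n))
    }
}

/// The serialize direction is just a cast to the discriminant.
pub fn mode_to_number(mode: Mode) -> i8 {
    mode as i8
}
```

In a real deserialize function the error arm would be turned into a Serde error rather than a String.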
3 Move into module
Update: 2021-01-15
- from(to)_millisecond_timestamp have been moved into their own module, millisecond_timestamp, and renamed to deserialize & serialize
- from(to)_duration_ms have been moved into their own module, duration_ms, and renamed to deserialize & serialize
- from(to)_option_duration_ms have been moved into their own module, option_duration_ms, and renamed to deserialize & serialize
4 Summary
To be honest, this was the first time I needed this kind of customized work, and it took me some time to understand how Serde works. In the end, the investment paid off; it works great now.
Serde is such an awesome deserialize/serialize framework, from which I have learnt a lot and still have a lot to learn.
5 Reference
- Deserializing optional datetimes with serde
- PR: Keep polishing the models
- PR: Refactor model
- PR: Deserialize enum from number