14
14
//! Unit. If any of the inputs changes from the last compilation, then the
15
15
//! Unit is considered dirty. A missing fingerprint (such as during the
16
16
//! first build) is also considered dirty.
17
- //! - Dirty propagation is done in the `JobQueue`. When a Unit is dirty, the
18
- //! `JobQueue` automatically treats anything that depends on it as dirty.
19
- //! Anything that relies on this is probably a bug. The fingerprint should
20
- //! always be complete (but there are some known limitations). This is a
21
- //! problem because not all Units are built all at once. If two separate
22
- //! `cargo` commands are run that build different Units, this dirty
23
- //! propagation will not work across commands.
17
+ //! - Whether or not input files are actually present. For example a build
18
+ //! script which says it depends on a nonexistent file `foo` is always rerun.
19
+ //! - Propagation throughout the dependency graph of file modification time
20
+ //! information, used to detect changes on the filesystem. Each `Fingerprint`
21
+ //! keeps track of what files it'll be processing, and when necessary it will
22
+ //! check the `mtime` of each file (last modification time) and compare it to
23
+ //! dependencies and output to see if files have been changed or if a change
24
+ //! needs to force recompiles of downstream dependencies.
24
25
//!
25
26
//! Note: Fingerprinting is not a perfect solution. Filesystem mtime tracking
26
27
//! is notoriously imprecise and problematic. Only a small part of the
97
98
//!
98
99
//! After the list of Units has been calculated, the Units are added to the
99
100
//! `JobQueue`. As each one is added, the fingerprint is calculated, and the
100
- //! dirty/fresh status is recorded in the `JobQueue` . A closure is used to
101
- //! update the fingerprint on-disk when the Unit successfully finishes. The
102
- //! closure will recompute the Fingerprint based on the updated information.
103
- //! If the Unit fails to compile, the fingerprint is not updated.
101
+ //! dirty/fresh status is recorded. A closure is used to update the fingerprint
102
+ //! on-disk when the Unit successfully finishes. The closure will recompute the
103
+ //! Fingerprint based on the updated information. If the Unit fails to compile,
104
+ //! the fingerprint is not updated.
104
105
//!
105
106
//! Fingerprints are cached in the `Context`. This makes computing
106
107
//! Fingerprints faster, but also is necessary for properly updating
107
108
//! dependency information. Since a Fingerprint includes the Fingerprints of
108
109
//! all dependencies, when it is updated, by using `Arc` clones, it
109
110
//! automatically picks up the updates to its dependencies.
110
111
//!
112
+ //! ## Considerations for inclusion in a fingerprint
113
+ //!
114
+ //! Over time we've realized a few items which historically were included in
115
+ //! fingerprint hashings should not actually be included. Examples are:
116
+ //!
117
+ //! * Modification time values. We strive to never include a modification time
118
+ //! inside a `Fingerprint` to get hashed into an actual value. While
119
+ //! theoretically fine to do, in practice this causes issues with common
120
+ //! applications like Docker. Docker, after a layer is built, will zero out
121
+ //! the nanosecond part of all filesystem modification times. This means that
122
+ //! the actual modification time is different for all build artifacts, which
123
+ //! if we tracked the actual values of modification times would cause
124
+ //! unnecessary recompiles. To fix this we instead only track paths which are
125
+ //! relevant. These paths are checked dynamically to see if they're up to
126
+ //! date, and the modifiation time doesn't make its way into the fingerprint
127
+ //! hash.
128
+ //!
129
+ //! * Absolute path names. We strive to maintain a property where if you rename
130
+ //! a project directory Cargo will continue to preserve all build artifacts
131
+ //! and reuse the cache. This means that we can't ever hash an absolute path
132
+ //! name. Instead we always hash relative path names and the "root" is passed
133
+ //! in at runtime dynamically. Some of this is best effort, but the general
134
+ //! idea is that we assume all acceses within a crate stay within that
135
+ //! crate.
136
+ //!
137
+ //! These are pretty tricky to test for unfortunately, but we should have a good
138
+ //! test suite nowadays and lord knows Cargo gets enough testing in the wild!
139
+ //!
111
140
//! ## Build scripts
112
141
//!
113
142
//! The *running* of a build script (`CompileMode::RunCustomBuild`) is treated
114
143
//! significantly different than all other Unit kinds. It has its own function
115
- //! for calculating the Fingerprint (`prepare_build_cmd `) and has some unique
116
- //! considerations. It does not track the same information as a normal Unit.
117
- //! The information tracked depends on the `rerun-if-changed` and
144
+ //! for calculating the Fingerprint (`calculate_run_custom_build `) and has some
145
+ //! unique considerations. It does not track the same information as a normal
146
+ //! Unit. The information tracked depends on the `rerun-if-changed` and
118
147
//! `rerun-if-env-changed` statements produced by the build script. If the
119
148
//! script does not emit either of these statements, the Fingerprint runs in
120
149
//! "old style" mode where an mtime change of *any* file in the package will
@@ -357,28 +386,37 @@ pub struct Fingerprint {
357
386
/// "description", which are exposed as environment variables during
358
387
/// compilation.
359
388
metadata : u64 ,
360
- /// Whether any of the `local` inputs, on the filesystem, are considered out
361
- /// of date by looking at mtimes .
389
+ /// Description of whether the filesystem status for this unit is up to date
390
+ /// or should be considered stale .
362
391
#[ serde( skip_serializing, skip_deserializing) ]
363
392
fs_status : FsStatus ,
364
- /// A file , relative to `target_root`, that is produced by the step that
393
+ /// Files , relative to `target_root`, that are produced by the step that
365
394
/// this `Fingerprint` represents. This is used to detect when the whole
366
395
/// fingerprint is out of date if this is missing, or if previous
367
396
/// fingerprints output files are regenerated and look newer than this one.
368
397
#[ serde( skip_serializing, skip_deserializing) ]
369
398
outputs : Vec < PathBuf > ,
370
399
}
371
400
401
+ /// Indication of the status on the filesystem for a particular unit.
372
402
enum FsStatus {
403
+ /// This unit is to be considered stale, even if hash information all
404
+ /// matches. The filesystem inputs have changed (or are missing) and the
405
+ /// unit needs to subsequently be recompiled.
373
406
Stale ,
407
+
408
+ /// This unit is up-to-date, it does not need to be recompiled. If there are
409
+ /// any outputs then the `FileTime` listed here is the minimum of all their
410
+ /// mtimes. This is then later used to see if a unit is newer than one of
411
+ /// its dependants, causing the dependant to be recompiled.
374
412
UpToDate ( Option < FileTime > ) ,
375
413
}
376
414
377
415
impl FsStatus {
378
416
fn up_to_date ( & self ) -> bool {
379
417
match self {
380
418
FsStatus :: UpToDate ( _) => true ,
381
- _ => false ,
419
+ FsStatus :: Stale => false ,
382
420
}
383
421
}
384
422
}
@@ -443,19 +481,40 @@ enum StaleFile {
443
481
}
444
482
445
483
impl LocalFingerprint {
484
+ /// Checks dynamically at runtime if this `LocalFingerprint` has a stale
485
+ /// file.
486
+ ///
487
+ /// This will use the absolute root paths passed in if necessary to guide
488
+ /// file accesses.
446
489
fn find_stale_file (
447
490
& self ,
448
491
pkg_root : & Path ,
449
492
target_root : & Path ,
450
493
) -> CargoResult < Option < StaleFile > > {
451
494
match self {
495
+ // We need to parse `dep_info`, learn about all the files the crate
496
+ // depends on, and then see if any of them are newer than the
497
+ // dep_info file itself. If the `dep_info` file is missing then this
498
+ // unit has never been compiled!
452
499
LocalFingerprint :: CheckDepInfo { dep_info } => {
453
- find_stale_file_from_dep_info ( pkg_root, & target_root. join ( dep_info) )
500
+ let dep_info = target_root. join ( dep_info) ;
501
+ if let Some ( paths) = parse_dep_info ( pkg_root, & dep_info) ? {
502
+ Ok ( find_stale_file ( & dep_info, paths. iter ( ) ) )
503
+ } else {
504
+ Ok ( Some ( StaleFile :: Missing ( dep_info) ) )
505
+ }
454
506
}
507
+
508
+ // We need to verify that no paths listed in `paths` are newer than
509
+ // the `output` path itself, or the last time the build script ran.
455
510
LocalFingerprint :: RerunIfChanged { output, paths } => Ok ( find_stale_file (
456
511
& target_root. join ( output) ,
457
512
paths. iter ( ) . map ( |p| pkg_root. join ( p) ) ,
458
513
) ) ,
514
+
515
+ // These have no dependencies on the filesystem, and their values
516
+ // are included natively in the `Fingerprint` hash so nothing
517
+ // tocheck for here.
459
518
LocalFingerprint :: RerunIfEnvChanged { .. } => Ok ( None ) ,
460
519
LocalFingerprint :: Precalculated ( ..) => Ok ( None ) ,
461
520
}
@@ -518,6 +577,12 @@ impl Fingerprint {
518
577
ret
519
578
}
520
579
580
+ /// Compares this fingerprint with an old version which was previously
581
+ /// serialized to filesystem.
582
+ ///
583
+ /// The purpose of this is exclusively to produce a diagnostic message
584
+ /// indicating why we're recompiling something. This function always returns
585
+ /// an error, it will never return success.
521
586
fn compare ( & self , old : & Fingerprint ) -> CargoResult < ( ) > {
522
587
if self . rustc != old. rustc {
523
588
bail ! ( "rust compiler has changed" )
@@ -640,15 +705,33 @@ impl Fingerprint {
640
705
bail ! ( "current filesystem status shows we're outdated" ) ;
641
706
}
642
707
708
+ // This typically means some filesystem modifications happened or
709
+ // something transitive was odd. In general we should strive to provide
710
+ // a better error message than this, so if you see this message a lot it
711
+ // likely means this method needs to be updated!
643
712
bail ! ( "two fingerprint comparison turned up nothing obvious" ) ;
644
713
}
645
714
715
+ /// Dynamically inspect the local filesystem to update the `fs_status` field
716
+ /// of this `Fingerprint`.
717
+ ///
718
+ /// This function is used just after a `Fingerprint` is constructed to check
719
+ /// the local state of the filesystem and propagate any dirtiness from
720
+ /// dependencies up to this unit as well. This function assumes that the
721
+ /// unit starts out as `FsStatus::Stale` and then it will optionally switch
722
+ /// it to `UpToDate` if it can.
646
723
fn check_filesystem (
647
724
& mut self ,
648
725
pkg_root : & Path ,
649
726
target_root : & Path ,
650
727
mtime_on_use : bool ,
651
728
) -> CargoResult < ( ) > {
729
+ assert ! ( !self . fs_status. up_to_date( ) ) ;
730
+
731
+ // Get the `mtime` of all outputs. Optionally update their mtime
732
+ // afterwards based on the `mtime_on_use` flag. Afterwards we want the
733
+ // minimum mtime as it's the one we'll be comparing to inputs and
734
+ // dependencies.
652
735
let status = self
653
736
. outputs
654
737
. iter ( )
@@ -661,32 +744,59 @@ impl Fingerprint {
661
744
return mtime;
662
745
} )
663
746
. min ( ) ;
747
+
664
748
let mtime = match status {
665
- Some ( Some ( mtime) ) => mtime,
666
- Some ( None ) => return Ok ( ( ) ) ,
749
+ // We had no output files. This means we're an overridden build
750
+ // script and we're just always up to date because we aren't
751
+ // watching the filesystem.
667
752
None => {
668
753
self . fs_status = FsStatus :: UpToDate ( None ) ;
669
754
return Ok ( ( ) ) ;
670
755
}
756
+
757
+ // At least one path failed to report its `mtime`. It probably
758
+ // doesn't exists, so leave ourselves as stale and bail out.
759
+ Some ( None ) => return Ok ( ( ) ) ,
760
+
761
+ // All files successfully reported an `mtime`, and we've got the
762
+ // minimum one, so let's keep going with that.
763
+ Some ( Some ( mtime) ) => mtime,
671
764
} ;
672
765
673
766
for dep in self . deps . iter ( ) {
674
767
let dep_mtime = match dep. fingerprint . fs_status {
675
- FsStatus :: UpToDate ( Some ( mtime) ) => mtime,
676
- FsStatus :: UpToDate ( None ) => continue ,
768
+ // If our dependency is stale, so are we, so bail out.
677
769
FsStatus :: Stale => return Ok ( ( ) ) ,
770
+
771
+ // If our dependencies is up to date and has no filesystem
772
+ // interactions, then we can move on to the next dependency.
773
+ FsStatus :: UpToDate ( None ) => continue ,
774
+
775
+ FsStatus :: UpToDate ( Some ( mtime) ) => mtime,
678
776
} ;
777
+
778
+ // If the dependency is newer than our own output then it was
779
+ // recompiled previously. We transitively become stale ourselves in
780
+ // that case, so bail out.
679
781
if dep_mtime > mtime {
680
782
return Ok ( ( ) ) ;
681
783
}
682
784
}
785
+
786
+ // If we reached this far then all dependencies are up to date. Check
787
+ // all our `LocalFingerprint` information to see if we have any stale
788
+ // files for this package itself. If we do find something log a helpful
789
+ // message and bail out so we stay stale.
683
790
for local in self . local . iter ( ) {
684
791
if let Some ( file) = local. find_stale_file ( pkg_root, target_root) ? {
685
792
file. log ( ) ;
686
793
return Ok ( ( ) ) ;
687
794
}
688
795
}
796
+
797
+ // Everything was up to date! Record such.
689
798
self . fs_status = FsStatus :: UpToDate ( Some ( mtime) ) ;
799
+
690
800
Ok ( ( ) )
691
801
}
692
802
}
@@ -790,6 +900,11 @@ impl DepFingerprint {
790
900
}
791
901
792
902
impl StaleFile {
903
+ /// Use the `log` crate to log a hopefully helpful message in diagnosing
904
+ /// what file is considered stale and why. This is intended to be used in
905
+ /// conjunction with `RUST_LOG` to determine why Cargo is recompiling
906
+ /// something. Currently there's no user-facing usage of this other than
907
+ /// that.
793
908
fn log ( & self ) {
794
909
match self {
795
910
StaleFile :: Missing ( path) => {
@@ -835,9 +950,13 @@ fn calculate<'a, 'cfg>(
835
950
} else {
836
951
calculate_normal ( cx, unit) ?
837
952
} ;
953
+
954
+ // After we built the initial `Fingerprint` be sure to update the
955
+ // `fs_status` field of it.
838
956
let target_root = target_root ( cx, unit) ;
839
957
let mtime_on_use = cx. bcx . config . cli_unstable ( ) . mtime_on_use ;
840
958
fingerprint. check_filesystem ( unit. pkg . root ( ) , & target_root, mtime_on_use) ?;
959
+
841
960
let fingerprint = Arc :: new ( fingerprint) ;
842
961
cx. fingerprints . insert ( * unit, Arc :: clone ( & fingerprint) ) ;
843
962
Ok ( fingerprint)
@@ -1047,6 +1166,16 @@ fn build_script_local_fingerprints<'a, 'cfg>(
1047
1166
move |deps : & BuildDeps , pkg_fingerprint : Option < & dyn Fn ( ) -> CargoResult < String > > | {
1048
1167
if deps. rerun_if_changed . is_empty ( ) && deps. rerun_if_env_changed . is_empty ( ) {
1049
1168
match pkg_fingerprint {
1169
+ // FIXME: this is somewhat buggy with respect to docker and
1170
+ // weird filesystems. The `Precalculated` variant
1171
+ // constructed below will, for `path` dependencies, contain
1172
+ // a stringified version of the mtime for the local crate.
1173
+ // This violates one of the things we describe in this
1174
+ // module's doc comment, never hashing mtimes. We should
1175
+ // figure out a better scheme where a package fingerprint
1176
+ // may be a string (like for a registry) or a list of files
1177
+ // (like for a path dependency). Those list of files would
1178
+ // be stored here rather than the the mtime of them.
1050
1179
Some ( f) => {
1051
1180
debug ! ( "old local fingerprints deps" ) ;
1052
1181
let s = f ( ) ?;
@@ -1093,6 +1222,9 @@ fn local_fingerprints_deps(
1093
1222
let mut local = Vec :: new ( ) ;
1094
1223
1095
1224
if !deps. rerun_if_changed . is_empty ( ) {
1225
+ // Note that like the module comment above says we are careful to never
1226
+ // store an absolute path in `LocalFingerprint`, so ensure that we strip
1227
+ // absolute prefixes from them.
1096
1228
let output = deps
1097
1229
. build_script_output
1098
1230
. strip_prefix ( target_root)
@@ -1146,13 +1278,18 @@ pub fn prepare_init<'a, 'cfg>(cx: &mut Context<'a, 'cfg>, unit: &Unit<'a>) -> Ca
1146
1278
Ok ( ( ) )
1147
1279
}
1148
1280
1281
+ /// Returns the location that the dep-info file will show up at for the `unit`
1282
+ /// specified.
1149
1283
pub fn dep_info_loc < ' a , ' cfg > ( cx : & mut Context < ' a , ' cfg > , unit : & Unit < ' a > ) -> PathBuf {
1150
1284
cx. files ( )
1151
1285
. fingerprint_dir ( unit)
1152
1286
. join ( & format ! ( "dep-{}" , filename( cx, unit) ) )
1153
1287
}
1154
1288
1155
- pub fn target_root < ' a , ' cfg > ( cx : & mut Context < ' a , ' cfg > , unit : & Unit < ' a > ) -> PathBuf {
1289
+ /// Returns an absolute path that the `unit`'s outputs should always be relative
1290
+ /// to. This `target_root` variable is used to store relative path names in
1291
+ /// `Fingerprint` instead of absolute pathnames (see module comment).
1292
+ fn target_root < ' a , ' cfg > ( cx : & mut Context < ' a , ' cfg > , unit : & Unit < ' a > ) -> PathBuf {
1156
1293
if unit. mode . is_run_custom_build ( ) {
1157
1294
cx. files ( ) . build_script_run_dir ( unit)
1158
1295
} else if unit. kind == Kind :: Host {
@@ -1184,8 +1321,10 @@ fn compare_old_fingerprint(
1184
1321
let old_fingerprint_json = paths:: read ( & loc. with_extension ( "json" ) ) ?;
1185
1322
let old_fingerprint: Fingerprint = serde_json:: from_str ( & old_fingerprint_json)
1186
1323
. chain_err ( || internal ( "failed to deserialize json" ) ) ?;
1187
- assert_eq ! ( util:: to_hex( old_fingerprint. hash( ) ) , old_fingerprint_short) ;
1188
- new_fingerprint. compare ( & old_fingerprint)
1324
+ debug_assert_eq ! ( util:: to_hex( old_fingerprint. hash( ) ) , old_fingerprint_short) ;
1325
+ let result = new_fingerprint. compare ( & old_fingerprint) ;
1326
+ assert ! ( result. is_err( ) ) ;
1327
+ return result;
1189
1328
}
1190
1329
1191
1330
fn log_compare ( unit : & Unit < ' _ > , compare : & CargoResult < ( ) > ) {
@@ -1222,17 +1361,6 @@ pub fn parse_dep_info(pkg_root: &Path, dep_info: &Path) -> CargoResult<Option<Ve
1222
1361
}
1223
1362
}
1224
1363
1225
- fn find_stale_file_from_dep_info (
1226
- pkg_root : & Path ,
1227
- dep_info : & Path ,
1228
- ) -> CargoResult < Option < StaleFile > > {
1229
- if let Some ( paths) = parse_dep_info ( pkg_root, dep_info) ? {
1230
- Ok ( find_stale_file ( dep_info, paths. iter ( ) ) )
1231
- } else {
1232
- Ok ( Some ( StaleFile :: Missing ( dep_info. to_path_buf ( ) ) ) )
1233
- }
1234
- }
1235
-
1236
1364
fn pkg_fingerprint ( bcx : & BuildContext < ' _ , ' _ > , pkg : & Package ) -> CargoResult < String > {
1237
1365
let source_id = pkg. package_id ( ) . source_id ( ) ;
1238
1366
let sources = bcx. packages . sources ( ) ;
0 commit comments