Support hourly data#54
Conversation
|
@meiertgrootes and @rogerkuou I implemented the support for hourly data and ran the notebook on dkrz jupyter notebook.
|
meiertgrootes
left a comment
There was a problem hiding this comment.
Very nice implementation, including the elegant support for both CPU ad GPU. Some minor comments, but non-blocking for the merge.
In particular, with the envisaged move of months to batch dimension, it would be good to consider dropping the month positional encoding.
Another low priority point (maybe before final relase) is the use of daily_* for the input values and monthly_* for target. the monthly is fine, but the daily can also be hourly. Maybe we can consider renaming somehow. Clearly not crucial
| daily_da, monthly_da, time_dim=time_dim | ||
| ) | ||
|
|
||
| # Convert to numpy once — all __getitem__ calls use these |
There was a problem hiding this comment.
We should probably update tis comment as the conversion format is now a torch tensor rather than numpy
| # Precompute the NaN mask before filling NaNs | ||
| # daily_mask: True where NaN (i.e. missing ocean data, not land) | ||
| self.daily_nan_mask = np.isnan(self.daily_np) # (M, T=31, H, W) | ||
| self.daily_nan_mask = torch.isnan(self.daily_t) # (M, T=31, H, W) |
There was a problem hiding this comment.
adding the _t suffix to tensors for clarity as you've done above is a really good idea. Should we adopt that consistently, i.e. also for the daily_nan_mask? Or do you think that is not needed?
There was a problem hiding this comment.
agreed, it is fixed.
| @@ -27,16 +24,14 @@ class VideoEncoder(nn.Module): | |||
| https://arxiv.org/abs/2203.12602 | |||
| """ | |||
|
|
|||
There was a problem hiding this comment.
You have chosen to get rid of dropout regularization throughout? Could you expand on why in the PR?
This will also propagate to the projection NN used for spatial embeddings. likely not a significant issue.
There was a problem hiding this comment.
Good point. To fix the performance and investigate the bottlenecks, I used the PyTorch profiler with record_function. One thing that stood out was the large number of aten::dropout and aten::bernoulli calls, under the class VideoEncoder. I originally added dropout to help with overfitting. But we are now using torch.optim.AdamW and a validation dataset during training. I tested both keeping and removing the dropout in this class. Removing it improved the runtime without affecting the results, so I decided to remove it here. I left the dropout in some other classes because they didn't show up as bottlenecks in the profiler results.
| max_days: Maximum length of the temporal dimension to precompute | ||
| encodings for. Default is 31, which is sufficient for a month of | ||
| daily data. | ||
| max_months: Maximum number of months (temporal patches) to precompute |
There was a problem hiding this comment.
We have dropped the day encodings in favor of doy and hod sensitive cyclical encodings. We have kept the explicit month encoding, which will provide varying encodings depending on the length of the sequence in months used and/or will potentially provide the same encodings for different months if a different startig point in the sequnce of months is used.
Should we consider dropping this or/and adding a month-of-year feature to the cyclical encoding module?
There was a problem hiding this comment.
Good point. As you suggested we can implement month-of-year. This issue is related to PR #64.
| ) | ||
|
|
||
| def forward(self, x, M, T, H, W, time_features, padded_days_mask=None): | ||
| # Pre-compute and register as buffer — auto-moves with .to(device/dtype) |
| seq = seq + temp_emb # add temporal embeddings | ||
| seq = seq + pe_months[None, None, :, None, :] # add month PE | ||
| temp_emb = self.time_embed(time_features) | ||
| pe_months = self.pe_months_cache[:M] |
There was a problem hiding this comment.
see above. In addition, if we move to folding months into the batch dimension, adding a month-of-year to the cyclic time embedding would preserve the information in a simple manner.
| source $HOME/.local/bin/env | ||
| ``` | ||
|
|
||
| 2. Create a new conda environment and install ipykernel and climanet: |
There was a problem hiding this comment.
Technically this isn't a conda environment, but a virtual environment crreated (and managed) by uv.
The reason to be pedantic about this, is that otherwise people might try to use (exisiting) conda installs to modify the environment.
It is possible to use uv with conda, but that requires it's own specific setup (and is not recommended afaik)
closes #50
This PR:
Issues found: