Expose `grid.evolve` to the C-API #344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Hi @Radonirinaunimi, yes, it does make sense.
Hi @vbertone, apologies for the late reply. I wanted to sort out the interface first before telling you what the order of the flattening should look like. However, I had not anticipated how difficult the interface would be, and therefore this is not done yet. It might still take some time.
Hi @Radonirinaunimi, no problem at all. Take your time.
Ok, this is not completely done (it needs some clean-up, fixing a memory leak in the example, etc.), but I think we can start to test and experiment with it. There is a primary example in evolve-grid.cpp. The operators should be flattened as follows:

```cpp
std::size_t index = 0;
for a { for i { for b { for l { eko[index++] = ...; } } } }
```

See pineappl/examples/cpp/evolve-grid.cpp, lines 62 to 71 in 43b1619.
Please let me know if something is wrong or doesn't make sense.
This is now doing what we intended to do. @vbertone, please test it and let me know. You can download the Grid for the test via:

wget --no-verbose --no-clobber https://data.nnpdf.science/pineappl/test-data/LHCB_WP_7TEV_opt.pineappl.lz4

@cschwan This is not super-clean yet and I would appreciate it if you could have a quick look (your feedback is always quite helpful).
Hi @Radonirinaunimi, thanks a lot, I will test it as soon as possible!
Hi @Radonirinaunimi, I just ran the code and it does work fine. Also, starting from here I should be able to follow the steps to eventually interface it to the evolution operator computed with APFEL++. I just have one question. Here is the output I get:
So the differences between grid and FK-table predictions are on the verge of 1%. Since the grid being used is (I guess) a realistic one and the evolution is essentially the unity, this seems to be a purely numerical effect related, I suppose, to interpolation. The question is: is this accuracy satisfactory? And can it in principle be improved? Thank you.
Hey @vbertone and @felixhekhorn, there was a bug in the masking of the orders in the example (fixed in b2bf81b). Now the agreement is:
Thank you, @Radonirinaunimi, this looks much better!
Hi @Radonirinaunimi, I'm having some difficulties understanding the behaviour of the code. Specifically, I would like to change the output PIDs because the evolution can generate channels that are not originally present. To this end, I made the following basic changes to the code
The purpose is to allow for all possible channels. However, I must be doing something wrong, because when I run the code I get this error that I cannot interpret:
I also tried to play around with the order of variables here and there, but I still get the same error (except that sometimes the two tensor shapes in the error message are swapped). Any idea of what I'm doing wrong? Thank you.
Hi @vbertone, the error is because of a mismatched shape between the EKO
Indeed, thanks @vbertone for trying that out. Note that this is (would be) significantly faster than with
Yes, I'd say the increase in time complexity is mainly due to the number of combinations, although all the other computations combined multiply that (number of operators, scales, etc.).
You'd have to add

Btw, do you know how much memory was required for you to complete the double evolution? Mine gets terminated mid-way on my poor old Dell XPS with 16 GB of memory. In any case, I'll explore this option:
as at least this would alleviate the memory issue.
Hi @Radonirinaunimi, using the coarse grid (10 nodes), the memory taken by the code is modest, around 100 MB. I guess I will wait for you to implement these ameliorations before venturing into editing the code myself :-). Thank you!
Hmhmh, that is much lower than I expected. If this is really the case, improving on memory would not be as crucial.
Okok, I'll keep you updated, in particular given that such an amelioration would not be straightforward.
Hi @Radonirinaunimi, I just tried with a larger grid (~100 nodes) and indeed in that case the memory footprint is much larger, around 10 GB. Apparently, it scales quadratically with the number of nodes, which looks reasonable. So, in the end, memory could possibly be an issue.
Hi @vbertone, @cschwan, in be81384, I wrapped the generation of evolution operators into a callback that gets passed to Rust. This way, only the operator that is needed for a given slice gets computed/loaded. With the minimal tests here, this leads to at least a ~40% reduction in memory. @vbertone, could you please check whether this is easily translatable to how APFEL++ generates the operators? If this works fine, then I'll clean up the implementation a bit and improve the functions' signatures.
Hi @Radonirinaunimi, thanks. Yes, I'm looking into the code now and I think that this is generally doable. However, I'm concerned by the fact that the callback function does not allow one to pass any theory settings (other than the convolution type), such as the perturbative order, heavy-quark masses, alpha_s, etc., necessary to construct the evolution operator. Moreover, would it be possible to pass to
Hi @vbertone, we can't exactly do that since we're limited to what C (not C++) gives us, but we can and should do the next best thing: pass a
Hi @vbertone! My bad; my thinking, at least for that first proof of concept, was to expose as few objects as possible by abstracting away all the

I believe that in 9efc9bc the exposed function now contains the minimal set of arguments needed to construct

See pineappl/pineappl_capi/src/lib.rs, lines 2101 to 2115 in 9efc9bc.
Please let me know what you think. PS: @cschwan, in case you'd like to get access to the
Hi @cschwan and @Radonirinaunimi, ok, understood, thanks. I will try to use this new interface to construct the evolution operator with
@Radonirinaunimi, concerning the question of alpha_s, I don't think that's an issue. The function that returns alpha_s is fast enough that the overhead of computing it twice is most probably negligible.
So, assuming that

```cpp
extern "C" void generate_fake_ekos(
    const int* _pids_in,
    const double* _x_in,
    const int* pids_out,
    const double* x_out,
    double* eko_buffer,
    void* pdf_state,
    pineappl_conv_type conv_type,
    double fac1,
    std::size_t pids_in_len,
    std::size_t x_in_len,
    std::size_t pids_out_len,
    std::size_t x_out_len
) {
    // Ignore unused variables
    (void) _pids_in;
    (void) _x_in;
    (void) pids_in_len;
    (void) x_in_len;
    // ------------------ Construct the Evolution Operator with APFEL++
    // ------------------ Accumulate thresholds and PIDs from the LHAPDF set
    std::vector<double> Thresholds; //{0, 0, 0, 1.3, 4.75};
    std::vector<int> pids_in; //{-5, -4, -3, -2, -1, 21, 1, 2, 3, 4, 5};
    const double mu0 = static_cast<LHAPDF::PDF*>(pdf_state)->qMin();
    for (auto const& v : static_cast<LHAPDF::PDF*>(pdf_state)->flavors()) {
        pids_in.push_back(v);
        if (v > 0 && v < 7) {
            Thresholds.push_back(static_cast<LHAPDF::PDF*>(pdf_state)->quarkThreshold(v));
        }
    }
    // Get the perturbative order from the LHAPDF set
    const int PerturbativeOrder = static_cast<LHAPDF::PDF*>(pdf_state)->qcdOrder();
    // Construct the running strong coupling
    apfel::AlphaQCD a{static_cast<LHAPDF::PDF*>(pdf_state)->alphasQ(91.1876), 91.1876, Thresholds, PerturbativeOrder};
    const apfel::TabulateObject<double> Alphas{a, 200, 0.9, 13001, 3};
    const auto as = [&] (double const& mu) -> double {
        return Alphas.Evaluate(mu);
    };
    // Construct the x-space grid for APFEL++ (can this be determined dynamically?)
    // const apfel::Grid g{{apfel::SubGrid{100, 1e-5, 3}, apfel::SubGrid{100,
    // 1e-1, 3}, apfel::SubGrid{100, 6e-1, 3}, apfel::SubGrid{80, 8.5e-1, 5}}};
    const apfel::Grid g{
        {apfel::SubGrid{80, 1e-5, 3}, apfel::SubGrid{40, 1e-1, 3}}};
    // Get the joint grid
    std::vector<double> x_in(g.GetJointGrid().nx() + 1);
    for (size_t i = 0; i < x_in.size(); i++)
        x_in[i] = g.GetJointGrid().GetGrid()[i];
    // The Evolution Operator is a vector with length `N_conv * N_Q2_slices * Σ
    // product(OP shape)`
    const std::size_t flat_len = pids_out_len * x_out_len * pids_in.size() * x_in.size();
    std::vector<std::size_t> shape = {pids_out_len, x_out_len, pids_in.size(), x_in.size()};
    // Zero operator
    const apfel::Operator ZeroOp{g, apfel::Null{}, apfel::eps5};
    // Run over convolution types
    int k = -1;
    // Construct the DGLAP objects for the relevant evolution and
    // tabulate the evolution operator
    std::unique_ptr<apfel::Dglap<apfel::Operator>> EvDists;
    if (conv_type == pineappl_conv_type::PINEAPPL_CONV_TYPE_UNPOL_PDF)
        EvDists = BuildDglap(InitializeDglapObjectsQCDPhys(g, Thresholds, true), mu0, PerturbativeOrder, as);
    else if (conv_type == pineappl_conv_type::PINEAPPL_CONV_TYPE_POL_PDF)
        EvDists = BuildDglap(InitializeDglapObjectsQCDpolPhys(g, Thresholds, true), mu0, PerturbativeOrder, as);
    else if (conv_type == pineappl_conv_type::PINEAPPL_CONV_TYPE_UNPOL_FF)
        EvDists = BuildDglap(InitializeDglapObjectsQCDTPhys(g, Thresholds, true), mu0, PerturbativeOrder, as);
    else
        throw std::runtime_error("Unknown convolution type");
    const apfel::TabulateObject<apfel::Set<apfel::Operator>> TabulatedOps{
        *EvDists, 100, 1, 13000, 3};
    // NOTE: The loop over the scale `fac1` is also removed
    // ...
}
```

with the main changes being that the loops on
Hi @Radonirinaunimi, yes, this does make sense and makes it much clearer. Thank you. I still have a doubt, though. In this example the evolution parameters are retrieved from the LHAPDF set, which in turn is passed to the function that returns the evolution operators through
Hmhm, I understand what you mean by a generic interface. Let me think a bit more.
Could you perhaps provide an example of such a general case? In this way, the next iteration could be based upon it, and therefore also be general xd.
Hi @Radonirinaunimi, you mean an example of an implementation? At the moment I don't have anything specific in mind. What I mean is: does the current signature of the function that returns the evolution operator accommodate a situation in which the

For example, one may define a structure which encapsulates the evolution parameters and pass an instance of that structure through
Yes, as long as it is not a complicated object (such as a function, although function pointers might still work). See for example the patch below for evolve-grid-identity.cpp:

```diff
diff --git a/examples/cpp/evolve-grid-identity.cpp b/examples/cpp/evolve-grid-identity.cpp
index f68a094d..42624e81 100644
--- a/examples/cpp/evolve-grid-identity.cpp
+++ b/examples/cpp/evolve-grid-identity.cpp
@@ -12,6 +12,11 @@
 // NOTE: Uses the scale of the Grid as the starting scale such that we can use an IDENTITY EKO.
 double FAC0 = 6456.44;
+struct ApfelxxParams {
+    int x_nodes;
+    double as_ref;
+};
+
 std::vector<std::size_t> unravel_index(std::size_t flat_index, const std::vector<std::size_t>& shape) {
     std::size_t ndim = shape.size();
     std::vector<std::size_t> coords(ndim);
@@ -30,7 +35,7 @@ extern "C" void generate_fake_ekos(
     const int* pids_out,
     const double* x_out,
     double* eko_buffer,
-    void* pdf_state,
+    void* params,
     pineappl_conv_type conv_type,
     double fac1,
     std::size_t pids_in_len,
@@ -46,9 +51,10 @@
     (void) conv_type;
     (void) fac1;
-    // Check to get the μ0 from the PDF
-    const double mu0_scale = static_cast<LHAPDF::PDF*> (pdf_state)->q2Min();
-    (void) mu0_scale; // Mark as unused variable
+    // Check external APFEL++ parameters
+    ApfelxxParams* op_params = static_cast<ApfelxxParams*>(params);
+    std::cout << op_params->as_ref << "\n";
+    std::cout << op_params->x_nodes << "\n";
     std::size_t flat_len = pids_in_len * x_in_len * pids_out_len * x_out_len;
     // NOTE: The EKO has to have as shape: (pids_in, x_in, pids_out, x_out)
@@ -175,6 +181,12 @@ int main() {
         alphas_table.push_back(alpha);
     }
+    // ApfelxxParams
+    ApfelxxParams* op_params = new ApfelxxParams;
+    op_params->x_nodes = 10;
+    op_params->as_ref = 0.118;
+    auto apfelxx_params = static_cast<void*>(op_params);
+
     std::vector<double> xi = {1.0, 1.0, 1.0};
     // NOTE: The EKO has to have as shape: (pids_in, x_in, pids_out, x_out)
     std::vector<std::size_t> tensor_shape = {pids_in.size(), x_in.size(), pids_out.size(), x_in.size()};
@@ -193,7 +205,7 @@ int main() {
     // - `ren1`: values of the renormalization scales
     // - `alphas_table`: values of alphas for each renormalization scales
     pineappl_fk_table* fktable = pineappl_grid_evolve(grid, opinfo_slices.data(),
-        generate_fake_ekos, max_orders.data(), pdf.get(), x_in.data(),
+        generate_fake_ekos, max_orders.data(), apfelxx_params, x_in.data(),
         x_in.data(), pids_in.data(), pids_out.data(),
         tensor_shape.data(), xi.data(), ren1.data(), alphas_table.data());
```

Would this be enough? PS: I should rename this to
Hi @Radonirinaunimi, yes, that's precisely what I had in mind. I'll try it and let you know. Thank you.
Hi @Radonirinaunimi and @cschwan, I managed to modify the code in

Things seem to work fine. However, when computing the convolution between grids and evolution operators in the case of two convolution types, neither the memory footprint nor the performance seems to have improved significantly. I uploaded the corresponding code in the
Hi @vbertone, out of curiosity, how are you estimating the memory usage? As for the runtime, I did not expect the speed to improve much, to be honest. I am currently profiling the code with the previous version and the current improvement, but it is taking quite some time 😬 (I am using

while previously it was of the order of GBs (I am still checking the stack).
Hi @Radonirinaunimi, I just used

But if I use a finer grid (~100 nodes) I find:

So now the memory usage is around 10 GB. Let me know if there is anything you would like to discuss.
Stupid question: how many nodes are used in the current remote version of
The current version uses the coarse grid. The grid is controlled through these lines:

The first is perhaps excessively dense, the second is kind of fine, and the third (the commented one, which you are probably using) is not accurate enough.
Hi @vbertone, sorry for the bit of a delay. Using the dense grid, I now get benchmarks consistent with yours:

In hindsight, I'd say this is a fact of life. The optimization above was meant to avoid storing all of the evolution operators in memory at once, keeping only one at a time; here, however, the problem is really with a single operator (maybe this is what you meant to tell me from the beginning but I missed it). I am not sure there is really a way around this, tbh (cc @cschwan), other than using a reasonable number of nodes (in our case, we always use 50, for which the accuracy is acceptable). Nevertheless, I'd say that if 100 nodes can still be run on a personal computer, that is not too bad.
Hi @Radonirinaunimi, ok, then that's what it is. Actually, I was wondering whether SIHP could be somehow optimised. If I remember correctly, you mentioned that all partonic channels contribute to the cross section, which would make 13^3 (or 11^3) channels. If so, I'm pretty sure that not all of them are independent. Wouldn't that allow us to reduce the amount of computation to be done for the combination? Also, I launched the combination with around 100 nodes yesterday evening and it's still running. Hopefully, I should get it done by today. But still, the computation time is of the order of one day, which is pretty heavy.
The SIHP grids are already optimized so that non-contributing channels are removed; see here for an example. But on that note, I am actually wondering whether the memory and speed issues are related to the explicit double-precision representation of the grid (?). But I am just thinking out loud...
Hi @Radonirinaunimi, the code finally converged (~20 hours) and here is the output:
The accuracy is not ideal, but I suspect it has to do with the time-like evolution, which does not precisely match that of the FF set that I'm using. But this is definitely encouraging.
Hi @vbertone, that took quite some time 😅 As for the accuracy, that is indeed definitely because of a mismatch in the evolution, which is also what I recall from when I evolved these grids with

The question is then: with a more reasonable number of grid nodes (say 40 or 50), would the accuracy still be of the same order (without significant deterioration)? If so, then we can just run with the coarse grid.
Using `grid.evolve` to evolve grids with the APFEL++ evolution

In addition to performing the evolution on the fly and outputting theory predictions, APFEL++ also wants the possibility to dump the evolved grids on disk for future usage (aka FK tables). This effectively implies `grid.evolve` being exposed at the C interface.

@vbertone, the idea is then for you to provide an operator which is a rank-4 tensor. To be more illustrative, let's consider a simple DIS case. The convolution, for a fixed $Q^2$, writes as:

[equation and index definitions shown in the original post]

These form the dimensions of an EKO, i.e. `EKO[a][i][b][l]`. Notice that the $x$-grid coordinates of the grid and FK table do not have to be the same, and they do not have to be the same as what is actually inside the grid $-$ the same is true for PID bases.

The C++ function that consumes the operator from APFEL++ would therefore tentatively look as follows (! Deprecated, see evolve-grid.cpp.), and this is the function that'll dump the evolved grid. I think this is the best way to do it as opposed to what we discussed during the meeting.

To summarize, what APFEL++ needs to provide (for a given value of $Q^2$) is a rank-4 tensor representing the EKO. In addition, we need the $x$ and PID values (can be in any basis) from which the EKO was computed $-$ this way we don't rely on much information from the grids to construct the EKOs.

Does this make sense?
Extra changes included in this PR (not fully related):

- `get-subgrids.cpp` examples

TODO:

- `LHCB_WP_7TEV_opt.pineappl.lz4` grid for the test, which currently is available at:
- `EKO_LHCB_WP_7TEV.txt`, which contains the flattened evolution operators