- **The PDE/problem specification is not stated, and the manuscript’s repeated claim of “2D Burgers” conflicts with the described data fields (x: 101, t: 103; no second spatial coordinate).** This prevents readers from validating what equation is solved, what “2D” means (2 spatial dims vs 1D-in-space + time), and what physical regimes are expected as $\nu$ varies (Sec. 2.1, Sec. 3, Sec. 4). The dataset naming (e.g., “turbulence bundle”) further increases ambiguity about whether this is a single trajectory, a random-IC ensemble, forced/decaying Burgers, etc. *Recommendation:* In Sec. 2.1, explicitly write the governing PDE(s) (including dimensionality, variables, and all terms), specify spatial/temporal domains, initial and boundary conditions, and clarify whether this is 1+1D Burgers (one space + time) or genuinely 2D-in-space Burgers (and if so, where the missing coordinate/fields are in the data). State whether each $\nu$ corresponds to a single IC/trajectory or an ensemble, and reconcile any “turbulence” terminology with the actual setup.
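
  For concreteness, if the setup is in fact 1+1D viscous Burgers, the statement requested in Sec. 2.1 would take the standard form below (domain endpoints, boundary conditions, and any forcing remain for the authors to specify):

  ```latex
  \partial_t u + u\,\partial_x u = \nu\,\partial_x^2 u,
  \qquad u = u(x,t),\quad x \in [x_{\min}, x_{\max}],\ t \in [0, T],
  \qquad u(x,0) = u_0(x).
  ```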
- **The PINN and training setup are under-specified, making it impossible to assess whether the latent-space trends reflect physics/learning or artifacts of a particular model/run.** Missing: architecture (depth/width/activations), loss terms and weights (PDE residual vs data/IC/BC), collocation and sampling strategy, optimizer schedule, stopping criteria, training diagnostics, and, crucially, whether a single multi-$\nu$ model was trained ($\nu$ as an input) or separate models per $\nu$ (Sec. 2.1, Sec. 4). *Recommendation:* Expand Sec. 2.1 (or add a dedicated Methods subsection) to fully document the PINN: architecture (including where the 10D layer sits and whether it is pre/post activation), all loss components and weights, sampling/collocation details, optimizer and learning-rate schedule, training duration and convergence diagnostics, and whether training is joint across all $\nu$ or separate per $\nu$. Provide at least basic solution-quality validation across viscosities (e.g., PDE residual statistics, BC/IC error, or comparison to a reference solver for a few $\nu$) so the latent analysis is grounded in accurate solutions.
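
  A template of the loss documentation being requested (the weights $\lambda_\bullet$, the residual operator $\mathcal{R}$, and the term grouping are placeholders, not the authors’ reported setup):

  ```latex
  \mathcal{L}(\theta)
  = \lambda_{\mathrm{pde}}\,\frac{1}{N_r}\sum_{i=1}^{N_r}
      \bigl|\mathcal{R}[u_\theta](x_i, t_i)\bigr|^2
  + \lambda_{\mathrm{data}}\,\frac{1}{N_d}\sum_{j=1}^{N_d}
      \bigl|u_\theta(x_j, t_j) - u_j\bigr|^2
  + \lambda_{\mathrm{ic}}\,\mathcal{L}_{\mathrm{ic}}
  + \lambda_{\mathrm{bc}}\,\mathcal{L}_{\mathrm{bc}}.
  ```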
- **The object of study (“10D latent space”) is not adequately defined or justified.** The manuscript does not clearly identify which layer is used, why that layer is representative, whether it is a bottleneck vs simply a hidden layer of width 10, and whether similar conclusions hold for other layers (Sec. 2.1, Sec. 3.2–3.3). This limits interpretability and generality. *Recommendation:* In Sec. 2.1, precisely identify the layer (index/depth; pre- vs post-nonlinearity; activation function) and justify why its activations are treated as “latent.” Add a minimal layer-wise comparison (e.g., one earlier and one later layer) to test whether (i) low effective dimension, (ii) “stable” PC orientations, and (iii) correlation-dimension behavior persist. If not feasible, explicitly scope conclusions to this layer in Sec. 4.
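
  One lightweight way to run the requested layer-wise comparison is a participation-ratio estimate of effective dimension (a standard proxy, not the manuscript’s own metric); the arrays below are synthetic placeholders standing in for real per-layer activations:

  ```python
  import numpy as np

  def participation_ratio(acts):
      """Effective (linear) dimension of an (n_samples, n_features) activation
      matrix via the participation ratio of its covariance eigenvalues."""
      lam = np.linalg.eigvalsh(np.cov(acts, rowvar=False))
      lam = np.clip(lam, 0.0, None)
      return lam.sum() ** 2 / (lam ** 2).sum()

  # Hypothetical layer activations (placeholders for the real bundles):
  rng = np.random.default_rng(0)
  low_rank = rng.normal(size=(1000, 3)) @ rng.normal(size=(3, 10))  # rank-3 signal in 10D
  isotropic = rng.normal(size=(1000, 10))                           # ~full-rank 10D
  print(participation_ratio(low_rank))   # bounded above by 3 (rank-3 covariance)
  print(participation_ratio(isotropic))  # close to 10
  ```

  Reporting this number per layer would directly test whether the low effective dimension is specific to the chosen 10D layer.
  
  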
- **Per-viscosity standardization (Sec. 2.2) materially changes what PCA measures and may affect cross-$\nu$ comparisons.** Standardizing each $\nu$ separately makes PCA reflect correlation structure rather than absolute variance/scales, and can alter explained-variance-ratio (EVR) trends and PC alignment across viscosities. The paper does not discuss these implications or provide sensitivity checks (Sec. 2.2–2.4, Sec. 3.2). *Recommendation:* Explicitly state, in Sec. 2.2–2.4 and again in Sec. 3.2, that PCA is performed on per-$\nu$ standardized activations and interpret EVR trends accordingly. Add a sensitivity analysis comparing at least: (i) per-$\nu$ standardization (current), (ii) global standardization using pooled mean/std across all $\nu$, and ideally (iii) no standardization (with careful interpretation). Consider adding a ‘pooled PCA’ (fit PCA on all $\nu$ jointly) and then analyze how per-$\nu$ covariance/EVR projects onto this global basis; this directly tests the “stable basis” claim.
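
  The suggested sensitivity check could be scripted along these lines (synthetic bundles stand in for the real per-$\nu$ activations; `sklearn` PCA assumed):

  ```python
  import numpy as np
  from sklearn.decomposition import PCA

  def evr(acts, n=4):
      """Top-n explained-variance ratios of an (n_samples, 10) activation set."""
      return PCA(n_components=n).fit(acts).explained_variance_ratio_

  rng = np.random.default_rng(1)
  # Placeholder activation bundles, one per viscosity (shapes assumed, not the real data)
  bundles = {nu: rng.normal(size=(500, 10)) * (1.0 + 10 * nu) for nu in (0.01, 0.1, 1.0)}

  pooled = np.vstack(list(bundles.values()))
  mu, sd = pooled.mean(0), pooled.std(0)

  for nu, A in bundles.items():
      per_nu = (A - A.mean(0)) / A.std(0)   # (i) current: per-nu standardization
      glob = (A - mu) / sd                  # (ii) global standardization, pooled stats
      print(nu, evr(per_nu)[0], evr(glob)[0])
  ```

  Variant (iii), no standardization, is `evr(A)` directly; reporting all three side by side would show whether the EVR trends survive the preprocessing choice.
  
  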
- **PCA “stability” claims (cosine similarity of $\text{PC}_k$ between $\nu_i$ and $\nu_{i+1}$) are potentially confounded by eigenvalue near-degeneracies and ordering/permutation ambiguity.** High cosine similarity can be misleading if PC2–PC4 eigenvalues are close and eigenvectors rotate within a near-degenerate subspace; comparing only successive viscosities can also mask cumulative drift (Sec. 2.4, Sec. 3.2.2, Table 3). *Recommendation:* In Sec. 3.2.2, report eigenvalue gaps (e.g., $\lambda_k/\lambda_{k+1}$ or $\lambda_k-\lambda_{k+1}$) to show when individual PCs are well-defined. Replace/augment per-component cosine similarities with subspace similarity metrics (principal angles) for the span of the top-$m$ PCs. Add an all-pairs similarity heatmap or similarity-to-a-fixed-reference-$\nu$ plot to detect long-range drift. Keep the sign-handling (absolute cosine) but also handle potential component swaps by matching PCs via maximum absolute dot product when eigenvalues are close.
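
  Both suggested diagnostics are cheap to implement, e.g. with `scipy.linalg.subspace_angles` for principal angles and Hungarian matching for component swaps; here `V1`, `V2` are assumed to hold PCs as columns (this is illustrative code, not the authors’ pipeline):

  ```python
  import numpy as np
  from scipy.linalg import subspace_angles
  from scipy.optimize import linear_sum_assignment

  def pc_subspace_similarity(V1, V2, m=4):
      """Principal angles (radians, largest first) between spans of top-m PCs."""
      return subspace_angles(V1[:, :m], V2[:, :m])

  def match_pcs(V1, V2):
      """Match components across viscosities by maximum |dot product|,
      guarding against swaps when eigenvalues are nearly degenerate."""
      sim = np.abs(V1.T @ V2)
      rows, cols = linear_sum_assignment(-sim)
      return cols, sim[rows, cols]

  # Toy check: a rotation within the PC2-PC3 plane changes individual PCs
  # (low per-component cosine) but leaves the 4D subspace intact.
  V = np.eye(10)[:, :4]
  theta = np.pi / 3
  R = np.eye(10)
  R[1:3, 1:3] = [[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]]
  angles = pc_subspace_similarity(V, R @ V)
  print(np.max(angles))  # ~0: same subspace despite rotated individual PCs
  ```

  This is exactly the failure mode of per-component cosine similarity that the principal-angle metric avoids.
  
  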
- **The Grassberger–Procaccia correlation-dimension estimation is insufficiently specified and lacks uncertainty/robustness analysis, yet it supports a central claim (non-monotonic $D_2$ peak at intermediate $\nu$).** Key missing details include: distance metric; whether standardized or raw latent points are used; $\epsilon$ range and sampling; how the scaling region is selected; handling of strong spatial/temporal correlations in the $(x,t)$ grid; and any computational approximations for $N\approx 10^4$ (Secs. 2.5, 3.3). *Recommendation:* Substantially expand Sec. 2.5 to document the exact implementation: metric, $\epsilon$ grid (min/max, log spacing, number of radii), scaling-region selection procedure (e.g., sliding-window fits with $R^2$ thresholds), and computational approach (full pairs vs subsampled pairs/k-d tree). Because points come from a structured $(x,t)$ grid, incorporate correlation handling (e.g., Theiler window in time, spatial subsampling, or block bootstrap). In Sec. 3.3, add uncertainty quantification (bootstrap over points/blocks and over fit windows) and include representative $\log C(\epsilon)$ vs $\log \epsilon$ plots with the fitted scaling region for low/intermediate/high $\nu$ (appendix acceptable). Also test sensitivity of $D_2$ to $\epsilon$-range choices and to subsampling/resolution of the $(x,t)$ grid.
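
  A minimal reference implementation of the estimator, with a sanity check on a known 2D set, might look like the following; the $\epsilon$-grid and fit-window choices here are illustrative defaults, which are precisely the knobs the manuscript should report:

  ```python
  import numpy as np
  from scipy.spatial.distance import pdist

  def correlation_dimension(points, eps=None, fit=(0.2, 0.8)):
      """Grassberger-Procaccia D2 estimate: slope of log C(eps) vs log eps
      over a fit window given as fractions of the log-eps grid.
      NOTE: for grid-sampled (x, t) data, temporally/spatially close pairs
      should be excluded (Theiler window) before counting."""
      d = pdist(points)  # Euclidean metric; the metric choice should be reported
      if eps is None:
          eps = np.logspace(np.log10(np.quantile(d, 0.01)),
                            np.log10(np.quantile(d, 0.25)), 20)
      C = np.array([(d < e).mean() for e in eps])
      lo, hi = (np.array(fit) * len(eps)).astype(int)
      slope, _ = np.polyfit(np.log(eps[lo:hi]), np.log(C[lo:hi]), 1)
      return slope, eps, C

  # Sanity check on a known case: a 2D uniform set embedded linearly in 10D
  rng = np.random.default_rng(2)
  plane = rng.uniform(size=(2000, 2)) @ rng.normal(size=(2, 10))
  d2, eps, C = correlation_dimension(plane)
  print(round(d2, 2))  # expect ~2 (slightly below, from boundary effects)
  ```

  Bootstrapping `d2` over resamples of `points` and over `fit` windows would supply the requested uncertainty bars.
  
  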
- **The reported intrinsic dimension ($\approx 1.5$–$1.75$) and PCA effective dimension ($\approx 3$–4) are discussed as if directly comparable, but the sampling geometry strongly constrains the latent point cloud:** for fixed $\nu$, $\mathrm{latent}(x,t;\nu)$ is the image of a 2D parameter domain $(x,t)$ under a smooth map, so intrinsic dimension is expected to be $\leq 2$ in generic settings. Without addressing this “domain-mapping” viewpoint, the correlation-dimension results may be an artifact of structured sampling rather than evidence of an emergent manifold complexity trend with $\nu$ (Sec. 3.3–3.4). *Recommendation:* In Sec. 3.3–3.4, explicitly discuss that the point cloud is generated by a mapping from a 2D grid $(x,t)$ and is not i.i.d.; explain how this naturally yields $D_2$ near 2 (or below due to correlations/finite-size effects). To strengthen interpretation, test whether $D_2$ is stable under changes in grid resolution and under random subsampling of $(x,t)$ points. Clarify why PCA may require $>2$ components to capture variance (e.g., curved 2D surface embedded in 10D), and separate this geometric explanation from any stronger “RG-like” claims.
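
  The geometric point can be illustrated directly: a toy smooth map from a 2D $(x,t)$ grid into 10D has intrinsic dimension 2 by construction, yet linear PCA spreads its variance over several components (the feature map below is invented purely for illustration):

  ```python
  import numpy as np
  from sklearn.decomposition import PCA

  # Image of a 2D (x, t) grid under an invented smooth map into 10D,
  # mimicking the structure of latent(x, t; nu) for fixed nu.
  x, t = np.meshgrid(np.linspace(0, 1, 60), np.linspace(0, 1, 60))
  u, v = x.ravel(), t.ravel()
  feats = np.stack([u, v,
                    np.sin(4 * np.pi * u), np.cos(4 * np.pi * v),
                    np.sin(2 * np.pi * u * v), u * v,
                    u ** 2, v ** 2,
                    np.sin(2 * np.pi * (u + v)), np.cos(2 * np.pi * (u - v))],
                   axis=1)

  cum_evr = np.cumsum(PCA().fit(feats).explained_variance_ratio_)
  print(cum_evr[:4])  # top-2 cumulative EVR well below 1, despite intrinsic dim 2
  ```

  This makes concrete why a PCA effective dimension of 3-4 is fully compatible with an intrinsic dimension of at most 2 for a curved surface.
  
  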
- **The RG analogy is presented as a central interpretive lens but remains metaphorical; the manuscript does not define an explicit coarse-graining or a flow with RG-like properties (semigroup structure, fixed points, universality).** Current wording risks over-claiming relative to the presented evidence (Secs. 2.6, 3.4, 4). *Recommendation:* Revise Sec. 3.4 and Sec. 4 to clearly label the RG connection as heuristic and specify the limited mapping being proposed (e.g., $\nu$ as a control/scale-like parameter; leading latent modes as “effective” degrees of freedom). Remove or soften language suggesting a rigorous RG correspondence unless you add an explicit coarse-graining/transformation and demonstrate RG-like behavior (e.g., fixed-point-like stabilization, monotone flow of a quantity) on the latent statistics.
- **Reproducibility is currently too limited for a scientific contribution: missing code/data availability statements; incomplete description of the exact pipeline from the .npy bundle to figures/tables; and corrupted/placeholder artifacts undermine confidence in reported numbers (Secs. 2.3–2.5, Sec. 3.1–3.2.1, Sec. 4).** *Recommendation:* Add a Data/Code Availability section (Sec. 4 or end matter) stating whether the trained model, activation bundles, and analysis scripts will be released. Provide a concise end-to-end pipeline description or pseudocode (loading, reshaping, standardization conventions, PCA implementation details, cosine-similarity computation, $D_2$ estimation settings). Ensure all tables/figures are regenerated from source outputs and remove placeholders/corrupt entries before submission.
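
  A pipeline sketch of the kind requested could be as short as the following (the file name, array layout, and shapes are assumptions to be replaced by the authors’ actual conventions):

  ```python
  import numpy as np
  from sklearn.decomposition import PCA

  def analyze_bundle(path="activations.npy"):
      """Hedged end-to-end sketch, not the authors' actual pipeline.
      Assumed layout: array of shape (n_nu, n_x * n_t, n_latent),
      e.g. (n_nu, 101 * 103, 10) for the grid sizes quoted in the paper."""
      acts = np.load(path)
      out = {}
      for i, A in enumerate(acts):
          A = (A - A.mean(0)) / A.std(0)   # per-nu standardization (Sec. 2.2)
          pca = PCA().fit(A)
          out[i] = {"evr": pca.explained_variance_ratio_,
                    "pcs": pca.components_}  # rows: PC1..PC10 in latent space
      return out
  ```

  Publishing even a sketch at this level of detail, plus the $D_2$ settings, would let readers regenerate every table and figure.
  
  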